This letter report, for the period of 1 October 1993 through 1 April 1994, describes the
activities supported by the grant in three sections: research progress, student profile, and
PI's activity.

1.  Research  Progress 

We have been quite successful in making our work known to the community in the past
few months. Specifically, at the several conferences in which we participated, people have
shown great interest.

A.  Fault  Modeling  and  Testing  of  Complex  Systems 

This project has gained some recognition in both fault modeling and testing of optical
interconnects in an opto-electronic system. In the first part, we want to develop a new
methodology for performance analysis of opto-electronic systems at a higher level. Our
approach is to first identify possible failures in such interconnect implementations, and then
extract information from the physical configuration and relate it to the system performance
parameters. In this way, system-level performance degradation can be estimated to construct
design constraints for physical systems. One analysis of link failures in free-space optical
interconnects is published in the Proceedings of the SPIE '94 conference, January 1994 (Exhibit I).

In the second part, on testing, we proposed an architecture which integrates the concepts of
concurrency and distributed test pattern generation for testing complex circuits on a planar
layout. This approach fully utilizes the fundamentals of BIST circuitry and performs test
pattern generation and response analysis concurrently, thus minimizing testing time for the
overall system. The result is published in IEEE Design and Test of Computers, vol. 10, no. 4,
pp. 38-51 (Exhibit II). Circuits are partitioned into segments which can be tested in parallel,
with testing time bounded by 2^n clock cycles, where n is the maximum number of inputs of the
biggest cluster. The result of this study will appear in the 1994 Custom Integrated Circuits
Conference (Exhibit III). Further hardware reduction can be achieved by binding test circuitry
to functional units through logic synthesis and optimization algorithms. We are currently
evaluating the impact of test pattern generation issues on the proposed architecture, and the
result has been accepted to the 1994 International Conference of ASIC. Simulations are also
underway on various benchmark circuits to better observe its overall performance on realistic
systems.
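
The 2^n bound mentioned above can be illustrated with a minimal sketch. It assumes that each segment is tested exhaustively with one pattern per clock cycle and that all segments run in parallel; the segment names and input counts are hypothetical and are not taken from the cited papers.

```python
def exhaustive_test_cycles(num_inputs):
    """Clock cycles needed to apply every input pattern to one segment."""
    return 2 ** num_inputs


def parallel_test_bound(segment_inputs):
    """Bound on test time when all segments are tested in parallel.

    The segment with the most inputs dominates, giving 2^n cycles,
    where n is the largest input count among the segments.
    """
    if not segment_inputs:
        return 0
    return exhaustive_test_cycles(max(segment_inputs.values()))


def serial_test_time(segment_inputs):
    """Test time if the same segments were tested one after another."""
    return sum(exhaustive_test_cycles(n) for n in segment_inputs.values())


# Hypothetical partition: segment name -> number of inputs.
segments = {"alu": 8, "decoder": 5, "control": 6}
print(parallel_test_bound(segments))  # 2^8 = 256
print(serial_test_time(segments))     # 256 + 32 + 64 = 352
```

The comparison shows why the partitioning matters: the parallel bound depends only on the largest cluster, not on the number of clusters.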

B. Performance Evaluation of Fault Tolerant Optical Multi-stage Interconnection
Network

This project helps to set up an evaluation scheme for complex degradable systems. We think
the uncertainty of optical interconnects supporting fault tolerance can easily be modeled in
this framework. An event-driven simulator has been developed to study the corresponding
measures as the replicated or dilated banyans degrade under component failure. A fault
injector is used to inject faults under some given distribution and to degrade the network
accordingly. In the multipath MINs, different components are given different MTTFs for a
4-replicated banyan of size 64 x 64 implemented with 2 x 2 switches. In practice, this can
reflect a 64-processor, 64-module shared-memory multiprocessor environment (one of a
moderate size), with exactly four distinct paths between any PE-MM pair. A set of design
parameters needs to be identified at the system level (as opposed to the device level) that can
be used to tune the system behavior to reflect the possible effects of technology. The design
parameters are:

•  topology 

•  level  of  redundancy 

•  component  lifetime 

•  switching  and  timing 

•  source  behavior 
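
The fault-injection idea described above can be sketched as follows. This is only an illustration, not the simulator developed in the project: it assumes an omega (shuffle-exchange) wiring with destination-tag routing as a stand-in for the banyan, exponentially distributed switch lifetimes, and a single MTTF for all switches (the report assigns different MTTFs to different components). All function names are hypothetical.

```python
import random

N = 64                       # network size (64 x 64)
STAGES = N.bit_length() - 1  # log2(64) = 6 stages of 2x2 switches
REPLICAS = 4                 # 4-replicated network: four paths per PE-MM pair


def path_switches(src, dst):
    """Switches (stage, index) on the unique path src -> dst in one replica.

    Assumes destination-tag routing on an omega network.
    """
    line, path = src, []
    for stage in range(STAGES):
        dst_bit = (dst >> (STAGES - 1 - stage)) & 1
        # Perfect shuffle, then the switch sets the low bit of the line.
        line = ((line << 1) & (N - 1)) | dst_bit
        path.append((stage, line >> 1))
    return path


def sample_failure_times(mttf, rng):
    """Draw one exponential lifetime per switch, keyed by (replica, stage, index)."""
    rate = 1.0 / mttf
    return {(r, s, w): rng.expovariate(rate)
            for r in range(REPLICAS)
            for s in range(STAGES)
            for w in range(N // 2)}


def surviving_paths(src, dst, failure_times, t):
    """Number of replicas whose src -> dst path has no failed switch at time t."""
    path = path_switches(src, dst)
    return sum(all(failure_times[(r, s, w)] > t for (s, w) in path)
               for r in range(REPLICAS))


rng = random.Random(0)
times = sample_failure_times(mttf=1000.0, rng=rng)
print(surviving_paths(5, 42, times, t=100.0))
```

A PE-MM pair is disconnected once `surviving_paths` drops to zero; tracking that count over time for all pairs is one way to observe the degradation process.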

A detailed discussion on the choice of optical network based on these design parameters is
in the book chapter submitted for Kluwer publication. We have some results on a general
N-component system which have been submitted to the 1994 SPDP symposium (Exhibit IV). More
experiments designed to give us more insight into the degradation process include a
convergence test to validate simulation results with respect to the number of fault patterns
injected and the running time, and two other experiments designed to observe the effect of
the submission rate on the mean reward before failure (MRBF) and the mission time. Results on
the replicated network have been encouraging; we are working on getting results for the dilated
network to be published in a journal submission.

2.  Student  Profile 


Amiya Bhattacharya is working on the simulator for performability evaluation. Huoy-Yu
Liou continues to develop the framework for the test architecture. She will be taking her
research exam in early summer. The two new graduate students (De-Yu Kao and Li-Cheng
Tai) are still taking some courses. However, De-Yu passed the comprehensive exam this
Spring, which means he will devote more time to research from this point on. His immediate
project is to design the controller for the testing architecture.

3. PI's Activity

Other than working with the students on the above-mentioned projects, I have been
involved in the ASIC and Testing communities. I presented our paper on "Reconfiguration of
Fault-Tolerant Linear Processor Arrays" at the International Conference on Parallel and
Distributed Systems (ICPADS'93), December 1993, and presented the "Analysis of Link Failures
in Free-Space Optical Interconnects" at the SPIE Optoelectronic Interconnects Conference,
January 1994. I have been invited to the IEEE BIST/DFT workshop to be held later this
month (April 19-22) to present our work on the pipelined testing architecture. I have also
received an invitation to present my work at the MCM Test Workshop this September. In
addition, I have been elected to the program committees for the Seventh Annual IEEE International
ASIC Conference, the 1994 Symposium on Parallel and Distributed Systems, and the Fourth
International Conference on CAD & CG, 1995.

The above details my recent research activities. If there are issues omitted or not clear,
please do not hesitate to call me. Thank you.


Best regards,


Ting-Ting Y. Lin
(619)  534-4738 
To appear in SPIE Optical Interconnect Conference, January 1994


Analysis on Link Failures in Free-Space Optical Interconnects

Huoy-Yu Liou and Ting-Ting Lin

Department of Electrical and Computer Engineering,

University of California at San Diego,

La Jolla, California 92093
liou@ece.ucsd.edu, lin@ece.ucsd.edu
ABSTRACT

Free-space optical interconnect has provided a promising solution to the effective signal links of the increasing density and
complexity in very-large-scale/ultra-large-scale integrated circuits. It is getting less affordable if such a system fails just for one tiny
physical defect. Our analysis on the potential optical-electrical link failure provides guidelines for future testing and reliable
system design. The study of fault models starts by exploring the underlying physical malfunctioning of the opto-electrical
components, and their impacts on the assembled systems. We map the physical defects of opto-electronic devices into their
corresponding logic-level representation for higher-level design consideration. This mapping is chosen for its compatibility
and practicability with digital electronic system designs where automated design tools can expedite the design optimization
and verification process.


1.  INTRODUCTION 

As integrated circuits scale down and switching speed and complexity go up, the demand for high-density, larger-bandwidth
and reconfigurable interconnects becomes crucial for integrated circuits with better performance and higher yield. The
free-space optical link in place of conventional metal wiring in semiconductor designs has been under research for over a
decade [1]. Research works strive to show that free-space optical interconnect provides one of the best solutions for beyond-Gb/s
performance ULSI ICs and systems [2][3]. The reasoning is based on the compatibility between semiconductor and optical
processing technologies and the potential competitive performance of optical interconnection over metal wiring [4][5].
Therefore, it is essential that physical failures in these optical-electronic hybrid design schemes are understood.

In this paper, we first review current free-space optical interconnect schemes, followed by the proposal of a fault model for
opto-electronic hybrid systems. This is achieved by mapping the physical failures of opto-electronic systems into their preliminary
logical representation in a comprehensive manner. A short introduction to the existing fault models for the electronic
world is provided for quick reference. Finally, we take two sample systems for the demonstration of the mapping. At this
moment, we will focus on modeling of time-invariant permanent faults where physical failures of components result in permanent
system failure.

2.  CURRENT  IMPLEMENTATIONS  OF  FREE-SPACE  OPTICAL  INTERCONNECT 

Current implementations of free-space optical interconnects can be categorized into two major families [5]. The first one
is characterized as the “modulator” scheme since it adopts external (off-processor or off-chip) light sources and spatial/planar
light modulators as the control of the optical paths. One example is the optical interconnect scheme using the SEED devices
[6][7][8]. The second family is called the “light source” scheme, which relies on micro-lasers [19] and photo-conductive
media/waveguides for one-to-one or one-to-many connections. Some prototypes utilizing microlaser arrays and planar optics
have been proposed [9][10].

In the “modulator” scheme, the photodiodes are built in the communicating processors/devices. The data and/or the control
signals are provided by external light sources such as the power-supply lasers. The outputs from each processing element
are read out using the read-out laser beam. Interconnection patterns are generated either from a pre-fabricated hologram or a
spatial-varying array of holograms. The photo-refractive indices of these holograms are subject to changes depending on the
materials. An applied electro-magnetic field, e.g., a control voltage or changes of wavelength of reference laser beam(s), can
be the pattern selector. Reconfigurability of the interconnects is thus possible on the pattern-selection basis [14]. Figure 1
shows an abstract picture of such a scheme. Here we use the intensity representation for the digital optical implementation as
described in [1].

For the “light source” scheme, each device/processor utilizes micro-lasers as outputs. The micro-lasers emit light according
to the assigned logical values at the outputs of the processor.

In the photo-refractive implementation, the interconnection media can be either a substrate hologram or simply a
waveguide connecting the outputs of one device to the input(s) of the other [11][12]. Interconnection patterns can be manipulated
also. An applied/built-in spatial-variant electromagnetic field to the waveguide substrate or photorefractive gratings will
change the diffracting holograms too [15][12]. Those multiplexed planar holograms can provide dynamic reconfigurable interconnect
patterns [15]. Figure 2 shows a symbolic sketch of this scheme, also using the intensity representation.

These two schemes not only are the major representatives of current studies but are based on fundamental optics. Therefore,
we chose them as the main implementations in our discussion of fault modeling.

3. PRESUMPTIONS AND FAULT MODELS IN THE DIGITAL REPRESENTATION

3.1.  Presumptions 

A few assumptions should be made to better illustrate the modeling process. It is implicitly conceived that analog optics is
not under consideration in this article even though some of our discussion may apply to the analog domain.

First, the mechanical alignment problem between optical components is of major concern in today's application of optics.
However, two promising techniques which are based on existing semiconductor technology were developed to circumvent
alignment difficulties: the self-aligned solder bumping [17] and silicon micromachined etching [18]. The former suggests a
sub-micron alignment tolerance in the planar integration of the optical components [11] and both techniques suggest that all of
the micro-lasers and photodiodes are to be put abutted to the photo-refractive medium to obtain the best alignment and area
efficiency [13]. Therefore, we assume that the mechanical alignment problem can be solved, and we will focus on other random
faults.

Secondly, the specific technology of our concern must be silicon-compatible. This is because the integration of opto-electronic
components is important to make them competitive with their electrical counterparts on performance consideration
[25].

Thirdly, we will concentrate on modeling of the permanent physical faults for now. An analysis of the physical faults in
each opto-electronic component will be performed in the following to determine the combined effect of these faults in the digital
domain. Soft and transient faults, although more frequent, are system-dependent. Further investigation is needed for temporal
system-related failure.

All the interconnect faults are transformed to the corresponding digital faults and are presented to the inputs of the receiving
processor(s) or devices. Figure 3 illustrates a logic representation of an opto-electronic system using the logic-level
representation.

3.2. Logic-Level Fault Models for Permanent Failures in Electronic Digital Systems

Before we investigate the possible physical failures of the opto-electronic components, a quick review of the popular fault
models in the electronic world is provided. It will benefit the large-scale opto-electronic hybrid designs if we can adapt those
techniques into our optical interconnect scheme by a successful mapping from the physical failure of such systems to the
corresponding digital representation.

Two major classes of logic-level fault models describe the permanent physical failures in electronics: stuck-at and bridging [26].
The stuck-at model simply describes the case where the logical value of the input/output of a gate/block is tied to 0 or 1, whereas
the bridging faults represent the cases of two adjacent wires being physically shorted to one signal line, thus issuing one logic value.
Those two models gain their popularity because they represent the majority of the hardware failure modes in the LSI/VLSI
technology [16], and testing techniques based on these models can provide satisfactory fault coverage as well as efficient testing
time. However, in CMOS technology, another fault model is the stuck-open, which represents an open connection in the circuit.
This fault can transform a combinational circuit into a sequential machine which still can be simulated on the gate/logic
level [26].
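
The two fault classes reviewed above can be made concrete with a small fault-simulation sketch. The three-input circuit, the line names, and the wired-OR interpretation of a bridge are assumptions chosen for illustration; the stuck-open model is not covered here.

```python
from itertools import product


def circuit(a, b, c, stuck=None, bridge=None):
    """Evaluate y = (a AND b) OR c with optional faults on the input lines.

    stuck:  dict mapping a line name to 0 or 1 (stuck-at fault).
    bridge: pair of line names shorted together, modeled here as wired-OR.
    """
    lines = {"a": a, "b": b, "c": c}
    if bridge:
        x, y = bridge
        shorted = lines[x] | lines[y]
        lines[x] = lines[y] = shorted
    if stuck:
        lines.update(stuck)
    return (lines["a"] & lines["b"]) | lines["c"]


def detecting_patterns(**fault):
    """Input patterns whose output differs between the good and faulty circuit."""
    return [p for p in product((0, 1), repeat=3)
            if circuit(*p) != circuit(*p, **fault)]


print(detecting_patterns(stuck={"a": 0}))      # [(1, 1, 0)]
print(detecting_patterns(bridge=("a", "b")))   # [(0, 1, 0), (1, 0, 0)]
```

Each returned pattern is a test vector for the injected fault, which is the basic operation behind automated fault simulation and test pattern generation.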

Fault simulation and testing techniques rely on effective fault models. A simple logic-level fault model can help speed
up the testing process through automated fault simulation and test pattern generation. Only by test automation can we
move onto a more cost-effective design cycle of large-scale circuits without sacrificing quality.

3.3. Physical Defects in Optical Components

An outline of possible physical faults of both active and passive devices is given below. In particular, physical defects in
the fabrication process of opto-electronic devices are investigated. Their impacts on the digital domain are discussed
individually.

Active  Devices 

* Processing defects of micro-lasers can shift their resonating mode, which may reduce the illuminating intensity at the
photodiode [20]. Low intensity, as a result, may not trigger the threshold current needed for a logic true output. This
can be viewed as a stuck-at-0 fault at the input of the succeeding processor. On the other hand, a constant high electrical
wire/contact will activate the resonator, which can be viewed as a stuck-at-1 fault.

* Abnormal impurity concentration/diffusion in modulators and fan-in/fan-out generators will give too weak (below the logic-level
low threshold) or too strong (above the logic-level high threshold) light beams under normal operation, which appear
as stuck-at faults at the inputs of the succeeding processor(s).

* Physical defects in photodiodes will result in stuck-at faults. This can be explained by the fact that unexpected impurity
concentration or crystalline imperfection will adjust the energy barrier of the proposed Schottky diodes [24] or the built-in
voltage, which takes part in offsetting the total photo-induced current of the p-i-n diodes [21].

* Defects in the micro-lasers/photodetectors can introduce fluctuation of the recovery time, which gives rise to delay faults.
It has been derived that the turn-on time of the semiconductor laser/diode is proportional to the carrier lifetime, which is
related to the impurity concentration [21].

Passive  Devices 

* Defects of the micro lenses/gratings or diffractive optical elements [13] will give less reflected intensity, shown in
Figure 4a as stuck-at-low faults, whereas in Fig. 4b, too much reflected light results in stuck-at-high or bridging
faults where two or more signals are collected at one photodetector by a faulty array of gratings. It is also possible that the
array of diffracting components fails to link two processing elements, such that no signal is present at the
receiving end. We call this a "link-open" fault, and designate an unknown state for it in our future simulation.

* The overwriting problem of the bulk and planar holograms during read-out [12] can contribute to permanent stuck-at or
bridging faults. Transient failure can be expected with the fluctuation of temperature, voltage or laser power.

* Potential crosstalk, light scattering and noise in the optical fiber/waveguide [22][23] can result in transient or bridging
faults. (However, this can be avoided by introducing new design rules between the two light paths.)

The above list is an outline of the most common physical faults in current implementations. We will illustrate the models
using the following applications.
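
The mapping from received optical intensity to a logic-level fault, as outlined in the list above, can be sketched in a few lines. The normalized thresholds, the use of `None` for "no light received", and the treatment of intermediate intensities as unknown are assumptions made for illustration only.

```python
LOW_MAX = 0.2    # hypothetical: at or below this normalized intensity reads as logic 0
HIGH_MIN = 0.8   # hypothetical: at or above this normalized intensity reads as logic 1


def logic_value(intensity):
    """Map a normalized received intensity to 0, 1, or 'X' (unknown)."""
    if intensity is None:          # no signal at the receiving end
        return "X"
    if intensity >= HIGH_MIN:
        return 1
    if intensity <= LOW_MAX:
        return 0
    return "X"                     # between thresholds: treated as unknown here


def classify_link(intensity_for_0, intensity_for_1):
    """Classify a link from what the detector sees when the source sends 0 and 1."""
    if intensity_for_0 is None and intensity_for_1 is None:
        return "link-open"
    v0, v1 = logic_value(intensity_for_0), logic_value(intensity_for_1)
    if (v0, v1) == (0, 1):
        return "fault-free"
    if v0 == 0 and v1 == 0:
        return "stuck-at-0"
    if v0 == 1 and v1 == 1:
        return "stuck-at-1"
    return "unknown"


print(classify_link(0.05, 0.95))  # fault-free
print(classify_link(0.05, 0.10))  # stuck-at-0, e.g. a weak micro-laser
print(classify_link(0.90, 0.95))  # stuck-at-1, e.g. too much collected light
print(classify_link(None, None))  # link-open
```

The point of the sketch is that every physical defect in the list is observed only through the intensity at the photodetector, so it can be reported as a digital fault at the input of the succeeding processor.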


3.4. Engineering/Performance Parameters Representing Physical Failures

There are several common engineering/performance parameters in designing free-space optical interconnects. Here
we try to link the parameters with digital fault models to make a quantitative representation of the permanent faults for system
design evaluation. This is particularly helpful for optical-electrical system designers to conduct system-level simulation for
automated design planning. Since free-space optical interconnect has to handle massively parallel interconnection, some of the
parameters are included for array-type digital interconnection in the system. The discussion is itemized as follows:


* Misalignment [27] - Deviation in incident light wavelength, lenslet focal length, image plane tilt, non-planar light wave
curvature, and distance between lenslet and image plane can cause misalignment. Alignment failure (i.e., non-tolerable
deviation of the above physical quantities in designing a system) in the laser array generator can result in bridging/
wired-or ("cross-talk"), stuck-at, or link-open faults.

* Light Throughput [12]/System Power Transmission [27] - The light efficiency of a free-space optical interconnect system
can be defined as the product of the light/power efficiencies of all components used for the interconnect, i.e.,

$\eta_{\mathrm{system}} = \prod_i \eta_i$ (1)

For a given photodetector set, the run-time power fluctuation in the components should be carefully restricted to give a
satisfactory tolerance at the detectors. Non-tolerable light/power fluctuation can cause stuck-at or link-open faults.

* Signal-to-Noise Ratio (SNR) and Bit Error Rate (BER) [28] - For a symmetric binary channel with Gaussian noise, the
BER can be written in terms of the SNR as

$\mathrm{BER} = 0.5\,\bigl(1 - \mathrm{erf}(\mathrm{SNR}/(2\sqrt{2}))\bigr)$, where erf(x) is the error function. (2)

When a free-space optical interconnect system is modeled by Eq. (2), the lower bound of the SNR of the system is given
by the performance requirement, the BER. Once the SNR function (in terms of the physical parameters of the components)
of a system is given, the noise margin is set for the system. For example, if BER < 10^-12 then SNR > 14. Using
the SNR given by Eqs. (B.1), (B.2), and (B.3) for the space-invariant optical interconnect system in [28], we have

SNR = [expression illegible in the source] > 14.

A relation can thus be derived between the component efficiencies in terms of N_s, the number of light sources, w_s, the
signal area, and A_s, the area covered by the higher-order phase from the diffractive lenses [inequality illegible in the
source]. Any combination of w_s and A_s breaking this inequality will fail to meet the system performance requirement
specified by the BER.
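
The two quantitative relations in the list above can be checked numerically with a short sketch. It assumes that the system efficiency is the plain product of the component efficiencies and reads Eq. (2) as BER = 0.5 (1 - erf(SNR / (2 sqrt 2))); the component efficiency values and function names are hypothetical.

```python
import math


def system_efficiency(component_efficiencies):
    """Light efficiency of the link: product of all component efficiencies."""
    return math.prod(component_efficiencies)


def ber_from_snr(snr):
    """BER for a symmetric binary channel with Gaussian noise.

    Uses erfc(x) = 1 - erf(x) to keep precision at very small BER values.
    """
    return 0.5 * math.erfc(snr / (2.0 * math.sqrt(2.0)))


def min_snr_for_ber(target_ber, lo=0.0, hi=100.0, iterations=100):
    """Smallest SNR meeting the BER target, found by bisection (BER falls as SNR rises)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if ber_from_snr(mid) > target_ber:
            lo = mid
        else:
            hi = mid
    return hi


# Hypothetical efficiencies for laser, hologram, and detector coupling.
print(system_efficiency([0.9, 0.8, 0.5]))
print(ber_from_snr(14.0))       # about 1.3e-12
print(min_snr_for_ber(1e-12))   # slightly above 14
```

Under this reading of Eq. (2), an SNR of 14 corresponds to a BER of roughly 10^-12, which agrees with the example given in the text.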


4. APPLICATIONS OF THE PROPOSED FAULT MODELS


4.1.  Modulator  Scheme 

In the SEED-based optical interconnect [29] example, the SEED devices are used as the output pads for inter-chip/board
communication. The laser diodes provide the light source for the input and output of the SEED modulator, and the lenses,
gratings, polarization beam splitters and mirrors are used to adjust the optical pathway. The output light from a SEED is
reflected by the polarization beam splitter and mirror(s) to the Si photodiode array, where the Si CMOS photodiodes are chosen
as the input pads for each processing module. (Fig. 9 of [6] shows this interconnect scheme.) SEEDs and photodiodes are
fabricated on the same substrate as that of the processing elements.

In the preceding setup, permanent defects can be found responsible for the misalignment problem. Misalignment will result
in stuck-at faults in the modulators, photodiodes, imperfect beam splitters, gratings, and mirrors. Faulty mirrors and beam
splitters could result in bridging and link-open faults. The overwriting problem for holograms can be included if the beam splitting
pattern and/or the Dammann grating are recorded as holograms.

Unfortunately, the faults are not exclusive of each other. The worst case will be a stuck modulator output distributing its
faulty signal to many photodiodes, which causes several stuck-at faults at the inputs of the next module(s). This is shown in
Figures 5a and 5b, where possible stuck-at and bridging faults can be found at the inputs of the processors in this scheme.
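
The worst case described above, a stuck output combined with a faulty interconnect pattern, can be sketched as a small propagation model. The names O1, O2, I1, I2, I3 and both patterns are hypothetical, and light from several sources reaching one photodiode is modeled as a wired-OR, which is an assumption.

```python
def receiver_inputs(outputs, pattern, stuck=None):
    """Logic values seen at the receiver inputs.

    outputs: dict of source output name -> intended logic value.
    pattern: dict of receiver input name -> list of source outputs whose light reaches it.
    stuck:   dict of source output name -> stuck logic value (stuck-at fault).
    """
    values = dict(outputs)
    values.update(stuck or {})
    # Light from any connected source that is on makes the input read 1.
    return {rx: int(any(values[src] for src in sources))
            for rx, sources in pattern.items()}


outputs = {"O1": 0, "O2": 0}

# Fault-free pattern: O1 fans out to I1 and I2, O2 drives I3.
good_pattern = {"I1": ["O1"], "I2": ["O1"], "I3": ["O2"]}

# Bridging pattern fault: I3 also collects the light intended for I1 and I2.
bridged_pattern = {"I1": ["O1"], "I2": ["O1"], "I3": ["O2", "O1"]}

print(receiver_inputs(outputs, good_pattern))
print(receiver_inputs(outputs, bridged_pattern, stuck={"O1": 1}))
```

With the stuck output and the bridged pattern together, all three inputs read 1 even though both sources were meant to send 0, which is the distributed stuck-at behavior the paragraph describes.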

4.2.  Light  Source  Scheme 

In the “light source” scheme, the VLSI-integrated transducer [30] is used as our example system. In this system, the outputs
of the processing elements are transformed into light signals by the array of micro-lasers, and semiconductive photodiodes
are used as the inputs. The micro-lasers and photodiodes can be fabricated with the electronic circuitry on the same substrate.
Micro lenses, mirrors, gratings and/or beam splitters are fabricated on one side of the glass substrate.

The InP substrate with reflective surfaces, mirrors and gratings provides the transmission media and interconnect patterns.
Figure 4 in [30] gives an abstract view of this scheme.

Problems in the micro-lasers, photodiodes, etc. can induce stuck-at faults. Since bridging and link-open faults are
less likely to come from the lasers and diodes, we can conclude that the bridging and link-open faults come from passive devices. In the
worst case, when a micro-laser and a part of the holographic pattern go wrong, distributed stuck-at faults at the inputs of the
next module(s) result. This is shown in Figures 6a and 6b. However, the existence of bridging or link-open faults can tell us
that the light-guiding medium is faulty.

Through the above discussion, one can conclude that the mapping of physical defects into digital representation for
free-space optical interconnect systems is very applicable. Once the mapping mechanism is established, we can talk about
fault-tolerant opto-electronic system design and testing in the digital world more efficiently.

5.  CONCLUSION  AND  FUTURE  WORK 

We have studied the physical defects in the components used by current implementations of free-space optical interconnect.
Extraction of the physical failure to a higher level of representation has been demonstrated both qualitatively and
quantitatively. This information abstraction can help system designers perform early-stage trade-off analysis and simulation using
automated design tools.

In the hybrid systems which we discussed in this work, it is possible to adapt the existing circuit testing techniques once
the mapping is done. As soon as the proposed systems are detailed, people can start thinking about the automated testability of
such hybrid systems. Whether those designs for testability are to be implemented by optical components or electronics should
be left to more specific discussion for different systems. Furthermore, a higher-level design-for-testability and fault-tolerant
design methodology is made possible once the potential failures are identified for a free-space optical interconnect system.

There are many topics this work may lead to. Soft faults, time-varying/transient faults, and fault-tolerant system-level
design are our immediate goals for further investigation. Moreover, the assumption that optical-to-electrical conversion is
perfect can be relaxed for detailed signal analysis. A proper evaluation methodology is also needed to characterize system
performance. People should think about the optical-electrical converters when we get into detailed signal analysis.
Fig. 1 Schematic representation of the “modulator” scheme.
Each processing element (PE) has photodiode(s) for the receiving of optical signals.


Fig. 2 Schematic representation of the “light source” scheme.
Each processing element has both the light sources (micro-lasers) and photodetectors for the data communication.


Fig. 3 Logical representation of the permanent physical faults from free-space optical interconnect components:
stuck-at fault model, link-open fault model, and bridging fault model.


Fig. 4 Processing defects for the micro mirrors can introduce two primary physical faults
for the (a) smaller-sized and (b) oversized mirror cases. The latter is not necessarily the nearest neighbor.


Fig. 5 The worst case of a combined failure in the “modulator” scheme can distribute
the stuck-at fault to several inputs of the succeeding processing module.
(a) Worst case of a stuck modulator, O1, and a bridging interconnect pattern, which will cause a faulty input for I1.
(b) Logical representation of the combined failure from (a).
Fig. 6 The worst case of a combined failure in the “light source” scheme can distribute
the stuck-at fault to several inputs of the succeeding processing module.
(a) Worst case of a stuck micro-laser, O1, and a bridging interconnect pattern, which will cause a faulty input for I1.
(b) Logical representation of the combined failure from (a).
A New Framework for Designing
Built-In Test Multichip Modules
with Pipelined Test Strategy

The authors introduce a novel test strategy, the loop testing architecture, to reduce aliasing
probability and testing time for multichip modules. Comparisons to other approaches confirm that
LTA provides a new framework for designing effective testable systems.

MULTICHIP MODULES interconnect multiple bare dies by means of a stack of conductive and dielectric
thin film. As the next generation of electronic packaging, MCMs offer tremendous advantages such as
reduced time delays between chips, less electrical noise and cross talk, simplified power
distribution, and small size. However, large I/O lead counts and high-density interconnections
decrease testing throughput and accelerate testing cost. Traditional hierarchical testing, which
involves testing chips individually before assembly and then testing the assembled module to avoid
any errors introduced during packaging, includes bed-of-nails fixtures and hand-held diagnostic
probes. These methods become impractical and costly with new technologies such as MCMs and
surface-mounted devices. Incomplete or unavailable test vectors from chip manufacturers and the
internal module's low observability contribute to these problems. Built-in test, where the circuit
or system under test includes a small circuit for testing, represents a new approach in testing.
Examples of well-known BIT techniques include the scan design and built-in logic block observer
methods.

Scan-design methods involve disconnecting the memory elements and/or the flip-flops from the combinational logic. The overwhelming number of test outputs generated by a relatively large circuit makes the scan method cumbersome. Signature analysis, a popular data-compaction solution, uses a linear-feedback shift register to receive and modify output data. After a long sequence of test patterns, the residue (or signature) in the shift register of a faulty circuit differs from the signature of a good circuit. Therefore, combining the boundary-scan and built-in self-test techniques provides an alternate method for testing complex circuits at the board level more efficiently.

BILBO combines the basic features of scan design with those of signature analysis. The shift registers form feedback paths by XORing some outputs from the flip-flops and connecting back to some of the flip-flop inputs. A given width of BILBO that implements a primitive polynomial has the corresponding XOR patterns. One combination of the control signals configures BILBO into a multiple-input shift register for compacting circuit responses when the shift registers contain extra control ports. Today's computer designs contain internal VLSI bus paths with widths of 16 or even 32 bits, while the BILBO design retains a bandwidth of 8 bits. Therefore, we need to redesign the BILBO to accommodate the new wider bus paths. Bhavsar proposed a family of concatenating polydividers with primitive characteristic polynomials to resolve the unextendable BILBO problem for packaged chips.

IEEE DESIGN & TEST OF COMPUTERS

To minimize hardware overhead and design time while maintaining certain state and fault coverages, we recommend a bytewise cascadable built-in tester (CBIT) macro cell with an optimum primitive characteristic polynomial. This keeps the CBIT cell in a design library, allowing circuit and system designers to easily construct the necessary feedback path for their BIST circuitry. Previous work on circular self-testing paths (CSTP) accomplished cascadability, however, by simply connecting registers in a circuit to form a closed loop with a feedback polynomial of 1. With a nonprimitive feedback characteristic polynomial, the CSTP approach is a special application of the CBIT. The performance of CSTP suffers as a result of its nonprimitive polynomial, and CSTP requires a sufficiently long testing time for the aliasing probability to approach the asymptotic value $2^{-N}$, where N is the input width of the circuit under test.

To further improve testing time, we propose a novel approach based on CBITs that allows concurrent MCM testing. We refer to this strategy as the loop testing architecture. LTA uses CBITs in a pipe interwoven with high-I/O-count chips on MCMs. Simulation results show that this guarantees high test coverage with the use of maximum-length pseudorandom sequences for test pattern generation. The aliasing probability compares favorably to that provided by a twofold multi-input linear-feedback shift register with only a fraction of the area necessary.

MCMs require significant exhaustive testing. The original test vectors for chips with high I/O counts from different manufacturers may not be available for the functional testing of the assembled module. In this case, parallel-pipelined exhaustive testing using LTA becomes imperative for the MCM designers to achieve better fault coverage more efficiently than boundary scan. Arrays of CBITs provided to the MCM, either in the form of a small chip on the same substrate or off the MCM test circuitry, allow chips without BIST circuitry to use LTA. Chips with existing on-chip BIST structures easily support LTA.

Cascadable built-in testing structure

The goal of our CBIT design is to provide a macro cell in the design library that expedites the BIT design process. We cascade CBIT cells to form a CBIT suite, using multiplexers and XORs placed in strategic locations to construct different feedback paths. This generates primitive polynomials in a multiple-byte configuration. A CBIT suite, with feedback connections that represent a primitive polynomial, acts as a maximum-length PRS generator. Not only does CBIT perform test pattern generation and signature analysis, it also permits cascadability to generate a maximal-length PRS. In performing signature analysis, a primitive characteristic polynomial gives a quicker convergence to the smaller asymptotic aliasing probability for a given test length.

Frequently used acronyms

BILBO   Built-in logic-block observer
BIST    Built-in self-test
BIT     Built-in test
CBIT    Cascadable built-in tester
CSTP    Circular self-testing path
CUT     Circuit under test
GLFSR   Generalized linear-feedback shift register
LFSR    Linear-feedback shift register
LTA     Loop testing architecture
MISR    Multiple-input shift register
PRS     Pseudorandom sequence
PSA     Parallel signature analyzer
SUT     System under test
TPG     Test pattern generator/generation

The CBIT design. A modified 8-bit BILBO forms a CBIT cell. It consists of three control signals, eight parallel inputs (Dbus), eight parallel outputs (Qbus), an LFSR consisting of eight flip-flops, and XORs that provide the feedback path of the LFSR. Two serial data ports, Scan_in and Scan_out, form the scan path. Finally, Feedback_in and Feedback_out provide the cascading links among CBITs. Figure 1a shows the 8-bit CBIT cell, while Figure 1b represents a 16-bit CBIT suite configured from two CBIT cells.

Use of the feedback pattern and the generating polynomial guarantees maximum-length PRS generation in both the 8- and 16-bit cases. Notice that in the 16-bit case, the feedback path for the least significant CBIT suite differs from the path of the most significant CBIT suite (Figure 1b). To guarantee the maximum randomness and quick convergence to the asymptotic value of the aliasing probability, the generating polynomial for the 16-bit CBIT must be prime. In general, cascading CBITs makes extended-length MISRs fit the increasing size of the data buses without redesigning the detail of the BIST circuitry. This speeds up the design modification cycle by making the original designs more testable.
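The link between a primitive feedback polynomial and a maximum-length PRS can be checked with a small simulation. The sketch below is only an illustration: the Galois-form shift register, the function names, and the polynomial x^8 + x^4 + x^3 + x^2 + 1 are assumptions chosen for the example, not the feedback pattern of the CBIT cell itself. The nonprimitive case x^8 + 1 stands in for a plain circular shift.

```python
def lfsr_step(state, poly, width=8):
    """Advance a Galois-form LFSR by one clock.

    `poly` encodes the characteristic polynomial including the x^width
    term, e.g. 0x11D for x^8 + x^4 + x^3 + x^2 + 1 (an assumed example).
    """
    state <<= 1
    if state >> width:      # a 1 was shifted out of the last stage
        state ^= poly       # fold it back in through the feedback taps
    return state


def cycle_length(seed, poly, width=8):
    """Count the clocks needed for the register to return to `seed`."""
    state, steps = lfsr_step(seed, poly, width), 1
    while state != seed:
        state = lfsr_step(state, poly, width)
        steps += 1
    return steps


PRIMITIVE = 0x11D   # x^8 + x^4 + x^3 + x^2 + 1, a primitive polynomial
CIRCULAR = 0x101    # x^8 + 1, a plain circular shift (nonprimitive)

print(cycle_length(1, PRIMITIVE))   # visits every nonzero 8-bit state
print(cycle_length(1, CIRCULAR))    # returns to the seed after one rotation
```

With the primitive polynomial the register walks through all 2^8 - 1 nonzero states before repeating, while the circular shift repeats after only eight clocks, which is why a nonprimitive loop needs a longer test to reach comparable coverage.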

Operation modes provided by CBIT. CBITs provide three modes of operation (Figure 2): parallel-register, scan path, and MISR. During normal operation, the parallel-register mode remains active. CBITs form pipelined parallel registers in the data path. During initialization and signature readout, the scan-path mode becomes active. The Scan_in port shifts in nonzero seeds and the Scan_out port reads out signatures. In addition, a scan path can form through the pipe to read out signatures in the intermediate stages. Configuration of the CBITs for testing occurs in the MISR mode that concurrently performs pseudoexhaustive test pattern generation for the succeeding CUT and output signature analysis for the previous CUT. The combinations of the three control signals provide three major operations as summarized in Table 1. As shown in the last three rows, the combinations of the last two control signals enable CBIT cascading.

Table 1. Control signals and the corresponding CBIT operation mode settings.

Control signals   Configuration
(111)             Parallel register mode
(01-)             Scan path mode
(1--)             MISR mode
(101)             Most significant byte for cascading
(110)             Least significant byte for cascading
(100)             Single-byte MISR
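The dual role of the MISR mode (compacting the responses of the previous CUT while its successive states serve as patterns for the next CUT) can be sketched in a few lines. This is a hedged illustration, not the CBIT circuit: the Galois-form update, the polynomial 0x11D, the seed, and the response words are all assumed values.

```python
POLY = 0x11D   # assumed primitive polynomial x^8 + x^4 + x^3 + x^2 + 1
WIDTH = 8


def misr_step(state, data_in):
    """One MISR clock: shift with feedback, then XOR in the parallel input."""
    state <<= 1
    if state >> WIDTH:
        state ^= POLY
    return state ^ (data_in & ((1 << WIDTH) - 1))


def run_misr(responses, seed=0x01):
    """Compact a response stream into a signature.

    The intermediate register states are returned as well, since in a pipe
    they would double as test patterns for the succeeding block.
    """
    state, patterns = seed, []
    for word in responses:
        state = misr_step(state, word)
        patterns.append(state)
    return state, patterns


good = [0x3A, 0x00, 0xF1, 0x7C, 0x05]   # hypothetical fault-free responses
bad = list(good)
bad[2] ^= 0x01                           # single-bit error in one response

sig_good, patterns = run_misr(good)
sig_bad, _ = run_misr(bad)
print(sig_good != sig_bad)
```

Because the update is linear and invertible, a single erroneous response word always changes the final signature; aliasing needs at least two error words that cancel.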

Pipelining for self-testing

The horizontal extension of the CBITs accommodates large I/O MCM testing. Further reduction in testing time results when several functional blocks in an MCM form a pipe to test blocks concurrently. (We use functional blocks to refer to the CUTs in a SUT, and modules to refer to CUTs/SUT with BIT circuits.) Each pipe consists of one zero-stage CBIT suite and subsequent stages of block and CBIT set.

Clusters of functional blocks possessing similar numbers of inputs/outputs form a pipe. Pipes constructed according to their functionality and data width improve efficiency. We then construct CBIT suites to match the data width of each pipe. For CUTs with very limited outputs (for example, encoders), we can cluster a greater number of CUTs for the CBIT suites to analyze at each stage. Alternatively, the partition/segmentation process discussed by Srinivasan et al. constructs several shorter or narrower pipes. Figure 3a shows construction of pipes for a data path in the SUT, while Figure 3b shows the pipe construction for a control path (which usually has nonuniform I/O bit width or branched signal flows). The requirements on the state and fault coverage and the aliasing probability determine the proper length of any given pipe.

Our previous work shows preliminary results. Following formation of the pipes, rearrangement of the number of stages in each pipe occurs so that most of the pipes finish self-testing simultaneously. Normally, existing data paths with pipelining naturally form self-testing pipes. When the pipe becomes too long and requires decomposition, only the zero-stage CBIT suite adds to the second pipe. Creating all pipes under this guideline after the rearrangement phase gives the maximum parallelism.

For high fan-in CUTs, decomposing the original network into segments with fewer fan-ins reduces computational effort. The controllability, detectability, and observability measures of a segmented circuit remain the same as for the unsegmented CUT. Adopting algorithms proposed by Yeh et al. allows grouping of segments into clusters. Often, these clusters identify natural pipes.

LTA. Because the cumulative test results degenerate over multiple stages, we need higher test coverage and lower aliasing probability in the pipelined MISR operation. Further cascading of CBITs in neighboring stages via the Feedback_in/Feedback_out lines increases the length of the CBIT suite. Constructing the scan lines to ease serial scanning of signatures from all CBITs after the test session results in LTA.

Figure 3a also shows the construction of LTA with Feedback_in selected for the CBIT that performs the most significant unit analysis and Feedback_out selected for the least significant CBIT. Daisy-chaining the Scan_in and Scan_out ports of the CBITs at each stage gives a scan path for the initialization and scans out the final signature of each CBIT. The last single CBIT suite of a pipe connects to the zero-stage CBIT. Thus for each pipe, we have double-length CBIT suites for signature analysis, resulting in smaller aliasing probability for the whole pipe. Figure 4 shows the equivalent data flow when the CBITs are paired to do a double-length signature analysis. The grayed functional blocks (F1, F2, and F4) show the paired testing flow when we cascade two CBITs in LTA. Neighboring CBITs of arbitrary length created with LTA result in a desirable aliasing probability.

Evaluation of the pipelining test

We configure all of the CBITs as MISRs for testing. They generate test patterns and perform signature analysis. Two assumptions justify this decision:

1. The result of the input (seed) and the current state under the operation governed by the characteristic polynomial of the LFSR/MISR shall not be 0 for any state of the PRS.

2. Multiple inputs (seeds) to the LFSR/MISR still traverse all the states of the PRS; we do not consider the degeneration or missing of some states in the PRS because of special combinations or sequences of the seeds.

Figure 3. LTA pipe for CUTs with: homogeneous data width (primarily data paths) (a); heterogeneous data width (primarily control paths) (b).

Figure 4. Data path of the m-stage pipelined extended CBIT testing.

These assumptions are validated by Kim et al., who discuss the existence of properties pertaining to the randomness of the patterns generated by a MISR that exist even if the inputs are not equally probable.

To justify the pipelined LTA approach, we must prove two things regarding the dual use of the intermediate CBITs. First, we must show that these CBITs perform effectively as TPGs with patterns generating maximum-length PRS at each stage. The pseudorandom property of the generating polynomial of the CBIT in the MISR mode supports this. (The percentage of the corresponding maximum-length PRS indicates the performance level.) Second, we must show that the limited output patterns of these functional blocks do not disturb the randomness of the signature, where the aliasing probability remains acceptably small after a number of stages.

Properties of the pseudorandom test pattern generation from CBIT. The construction of CBITs by LFSRs with primitive and irreducible characteristic polynomials gives them the following major properties:

■ Every element (or state) $a$ in the PRS generated by the LFSR has a complementary element (or state) $\bar{a}$ in the same sequence such that $a \oplus \bar{a} = 0$ (N-bit-wide 0's), where '$\oplus$' represents the operation on the complementary elements of the PRS as defined by the characteristic polynomial of the LFSR.

■ For the cyclic PRS, more than one input seed will either decompose the original maximum-length cycle to several subcycles or merge at least two subcycles together.

■ The total number of (distinct) states of all the subcycles (if decomposed by multiple seeds) is $2^{N}-1$ for an N-stage LFSR. In this manner, we exclude the 0 state from the PRS, and it forms a trivial cycle (0 → 0) for the LFSR.

Cascaded CBITs generate test patterns for different functional blocks in a pipe, giving rise to the following observations:

■ Each CUT with N-bit-wide input buses needs $2^{N}-1$ different test patterns to finish the exhaustive self-testing.

■ To exhaustively test the paired CUTs (each with no more than N inputs), one pair of CUTs should generate $2^{2N}-1$ test patterns.

For example, given an 8-input CUT, we need $2^{8}-1$ test patterns. Assuming that there is no correlation between the two neighboring CUTs for one pair of cascaded CBIT suites, we need one maximum-length cycle of the 8-bit-wide PRS to fully test one 8-bit CUT. However, because of the equally distributed ones in the 16-bit-wide PRS, we should fully test two 8-bit CUTs before the extended PRS reaches its maximum-length period (which is $2^{16}-1$). The actual number of the test patterns needed to fully test m CUTs simultaneously using m cascaded CBITs depends on the characteristic polynomial of the extended CBITs and the input seeds. But in general, we have the following relation:

$(2^{N}-1) \le L \le (2^{mN}-1)$   (1)

where L represents the test length needed to test m CUTs exhaustively given m cascaded CBIT suites. Therefore, CBITs perform effectively as TPGs when we choose an appropriate test length according to Equation 1.
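As a quick numeric check of Equation 1, the sketch below evaluates both bounds for the 8-bit example in the text. It assumes the bounds read as 2^N - 1 (one maximum-length cycle of a single N-bit suite) and 2^(mN) - 1 (one maximum-length cycle of m cascaded suites); the function name is hypothetical.

```python
def length_bounds(n_bits, m_suites):
    """Return the (lower, upper) test-length bounds assumed for Equation 1."""
    lower = 2 ** n_bits - 1                 # one cycle of a single N-bit PRS
    upper = 2 ** (n_bits * m_suites) - 1    # one cycle of the extended PRS
    return lower, upper


lower, upper = length_bounds(n_bits=8, m_suites=2)
print(lower, upper)   # bounds for two cascaded 8-bit CBIT suites
```

For two 8-bit suites the window runs from 255 to 65,535 patterns, so even a test several times longer than the single-suite cycle stays far below the extended period.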

Aliasing probability for single-stage MISRs. A special case of the generalized LFSR occurs with CBITs in the MISR mode. A generalized m-stage, N-input signature analyzer with linear feedback patterns built over the Galois field $GF(2^{N})$, GLFSR(m, N), becomes known as an N-input MISR when m = 1, that is, GLFSR(m = 1, N). Therefore, when the CBITs have characteristic polynomials designed to be prime, maximum-length test patterns result, and the asymptotic aliasing probability converges quickly.

Theorem 2 by Karpovsky et al. provides a general formula for calculating the aliasing probability for a one-stage, N-bit-wide MISR:

$P_{al}(p_0, p_1, p_2, p_3, \ldots) = 2^{-N}\,[\ldots]$

(the bracketed term, a sum over $k \ge 1$ of a product over $j \ge 0$, is not legible in the source), where L is the test length and $p_0, p_1, p_2, \ldots, p_{2^{N}-1}$ are the Walsh transforms of the error probabilities from an N-bit-output CUT. Error detection does not occur if $p_i = 0$, but always results when $p_i = 1$.

Some closed forms of $P_{al}$ exist with additional conditions. Two of these closed-form $P_{al}$'s result from setting the bit-error transition probability p to 0.5:

■ When the test length L is $m(2^{N}-1)$ (where $m \ge 1$ is an integer), the aliasing probability $P_{al}$ is given by Equation 2 [equation not legible in the source], where $m \ge 1$ is an integer, for the independent error model.

■ When the number of test patterns is an arbitrary positive integer L and the probability of an output bit being wrong is 0.5, $P_{al}$ is given by Equation 3 [equation not legible in the source] for the $2^{N}$-ary symmetric-channel error model.

Notice that for both cases when $2^{N} \gg 1$, $P_{al}$ converges to an asymptotic value $2^{-N}$. [When the test length L is less than one maximum length for the N-bit-wide MISR for the independent error model (for example, $2^{N}-1$), we use Equation 2 to calculate the exact aliasing probability of the MISR.]

Aliasing probability in the pipelining scheme. In the multistage pipelining MCM testing scheme, we calculate the aliasing probability for the kth stage as

$P_k$ (aliasing probability at kth stage) $= 1 - [\text{nonaliasing probability over } k \text{ stages}]$

Let $2^{N} \gg 1$ (generally true for all CBIT suites). We can then approximate $P_{al}$ of the N-input MISRs as $2^{-N}$ for all the values of m or L in Equations 2 and 3. $P_k$ then simplifies to

$P_k = 1 - (1 - 2^{-N})^{k}$   (4)

By ignoring the contribution from the terms smaller than $2^{-N}$, the aliasing probability for the kth stage pipelined CBIT converges to

$P_k \approx k \times 2^{-N}$   (5)

Equation 5 gives the asymptotic value for both the symmetric-channel error model with any test length and the independent error model with a test length of at least one maximum length. When the number of stages k is much smaller than the maximum length of the PRS generated by the MISR (for example, $2^{N}-1$), the aliasing probability at the kth stage MISR in the pipelining scheme is of the order of magnitude $O(2^{-N})$. The simulation result in Lin and Kaseff, where the aliasing frequency and probability both stay constant as $O(2^{-16})$ over six stages in the pipelining path, also validates this result.

Figure 5. LTA capability for testing: module functionality (a); module functionality and interconnections (b). [Block diagram: a Scan_in port, a zero-stage TPG, Modules 0 through 3 interleaved with CBIT suites, and a Scan_out port.]

Thus, the randomness of the signature is preserved in the case of a limited number of multiple inputs. Also, the aliasing probability is sufficiently small given that the number of stages k is small compared with the maximum length of the PRS.
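The stage-wise estimate can be checked numerically. The sketch below assumes each stage aliases independently with probability 2^(-N), so that the probability over k stages is 1 - (1 - 2^(-N))^k, and compares it with the first-order value k × 2^(-N) of Equation 5. The function name and the independence assumption are illustrative, not taken from the article.

```python
def stage_aliasing(k, n_bits):
    """Return (exact, first_order) aliasing probability over k stages.

    Assumes every stage aliases independently with probability 2**-n_bits.
    """
    p_single = 2.0 ** -n_bits
    exact = 1.0 - (1.0 - p_single) ** k     # 1 - nonaliasing over k stages
    first_order = k * p_single              # Equation 5 approximation
    return exact, first_order


exact, approx = stage_aliasing(k=6, n_bits=16)
print(exact, approx)
```

For six stages of 16-bit suites both values are close to 9.2e-5, which stays within the same order of magnitude as 2^(-16) as long as k is small.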

Other LTA applications

LTA has several additional applications. It can test interconnections, and is effective in locating faults in intermediate CBITs.

Capability for testing interconnections. We view interconnections among the MCMs as simplified CUTs with compatible data paths, and the whole system as many CUTs (including the interconnections) requiring testing under the LTA scheme. The pipelining scheme tests interconnections among MCMs by integrating two sets of CBITs next to the I/O pins in each module. The first CBIT set operates in the MISR mode, validates the input before the signals reach the internal logic, and serves as the TPG for the internal logic blocks. The second set of CBITs also operates in the MISR mode, examining the output from the internal logic circuitry, and serves as the TPG for the interconnection to the next module.

Figure 5a shows one CBIT suite placed at the primary outputs of each CUT with the zero-stage CBIT suite added to generate the pseudorandom test pattern for the first CUT. Cascading neighboring CBITs (as shown in Figure 3a) tests the modular functionality of each CUT. However, this implementation cannot test the interconnections between the CUTs. Figure 5b shows an extra CBIT suite inserted near the primary inputs of each CUT. Therefore, we always have a CBIT suite testing either a functional block or an interconnection pattern between two CUTs.

This implementation provides a general approach that can test fault patterns in all permutations, including multiple stuck-at faults, bridging or coupling, and pattern-sensitive faults. The N-bit-wide interconnection network often realizes fewer than $2^{N}-1$ different patterns for implementing signal links between any two CUTs. However, our N-bit-wide CBIT suite can generate $2^{N}-1$ different test patterns to exercise the N-bit-wide interconnection exhaustively.

In scheduling the testing for the interconnections, extra modes become unnecessary. In addition, timing conflicts do not exist because we use two sets of CBITs near the I/O pins that transform the interconnections into another type of CUT directly. This pipelining scheme allows for concurrent testing of both the modules and interconnections. Adapting LTA for interconnection testing by inserting one extra CBIT suite near the input ports (making the interconnections observable) results in area overhead. In contrast, pipeline testing for the modules requires only one CBIT suite at the outputs of each module. Testing the interconnections and the module functionality concurrently requires an extra CBIT suite but saves separate modes for reconfiguring the SUT to test the interconnections. Therefore, simultaneous testing yields significant time savings with only a small area penalty. Furthermore, the placement of CBIT sets holds when we move the I/O ports to the center of the modules.

Fault location in intermediate CBITs. CUT signatures are read out when we configure CBITs in the scan-path mode. Generally, a wrong final signature indicates that faults exist in the test pipe. However, there exists the possibility of faults from different CUTs in a pipe canceling each other and producing a good signature at the last stage. Therefore, we need to know the exact test length applied to each CBIT suite. This results in observable signatures at the intermediate stages and facilitates fault location. By locating faults in specific CUTs, we achieve better fault diagnosis.

Examples 

We developed two experiments to demonstrate the effectiveness of the proposed LTA. The first involves testing a homogeneous processor environment consisting of SN74LS181 ALUs. The second encompasses a heterogeneous MCM system with several types of components. We transformed both of these systems into test pipes.

Six-stage  ALU  pipes 

Six ALUs (SN74LS181) form a pipe with 16-bit CBIT suites inserted between the ALUs. Each 16-bit CBIT suite acts as a TPG for the 14-bit input ALU. The 8-bit output of the ALU feeds into a CBIT suite configured for signature analysis. In this experiment, we develop two pipes based on LTA: one implements a primitive characteristic polynomial between the looped CBIT pairs, while the other directly connects the feedback lines without changing the feedback pattern of each CBIT suite. We also reconstruct the straight pipe from our previous experiments to provide a baseline comparison.

Randomness of the TPG. We measure the randomness of the TPG process at each stage of the three pipes for various test lengths. This allows us to evaluate the effectiveness of the CBITs as TPGs while operating in the MISR mode as well as the impact of pipe length on the test pattern generation. For an N-bit-wide CBIT suite, the randomness measure is 100% if $2^{N}$ test patterns are generated. Figure 6 shows the randomness measure for each stage of the three different pipes. In all three configurations, the randomness levels off after the first stage, indicating that the length of the pipe does not affect the random TPG process.
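One way to picture the randomness measure is as the fraction of all N-bit patterns that a register in MISR mode has produced after a given test length. The sketch below is a simplified stand-in for the experiment: it uses an 8-bit register, an assumed primitive polynomial (0x11D), and uniformly random input words in place of real ALU responses, so its numbers are illustrative only.

```python
import random

POLY = 0x11D   # assumed primitive polynomial x^8 + x^4 + x^3 + x^2 + 1
WIDTH = 8


def misr_step(state, data_in):
    """One MISR clock: shift with feedback, then XOR in the parallel input."""
    state <<= 1
    if state >> WIDTH:
        state ^= POLY
    return state ^ data_in


def randomness(patterns):
    """Fraction of the 2**WIDTH possible patterns that actually occurred."""
    return len(set(patterns)) / 2 ** WIDTH


def simulate(test_length, seed=1, rng_seed=0):
    """Run the MISR on hypothetical random inputs and score its states."""
    rng = random.Random(rng_seed)
    state, seen = seed, []
    for _ in range(test_length):
        state = misr_step(state, rng.randrange(2 ** WIDTH))
        seen.append(state)
    return randomness(seen)


for length in (64, 256, 1024, 4096):
    print(length, round(simulate(length), 3))
```

The measure can only grow with test length, and in this toy setting it approaches 100% once the test is several times longer than one maximum-length cycle.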

Our previous experiments showed that the required test length L for the two N-bit CUTs under LTA testing (using Equation 1) is smaller than $2^{2N}-1$. The simulation result in Figure 6 validates this observation by showing that we can exhaustively test all ALUs in each cascading stage of the pipe when L is about four times the maximum length of the N-bit-wide CBIT suite. That is, instead of the far larger exhaustive pattern counts for the two ALUs (their exponents are not legible in the source), we require only about four times one maximum-length cycle of test patterns to give 100% randomness for the two ALUs at each cascading stage.

The cascaded CBIT suite that implements LTA outperforms the straight pipe by producing the best random patterns. As we increase the test length, the CBIT suites eventually generate a 16-bit-wide maximum-length PRS. This validates our earlier assumption that multiple inputs to the PRS generators still produce the maximum-length PRS. Regardless of how the 8-bit-wide outputs are connected to the CBIT suite (at the higher, lower, or even the middle byte), after a sufficiently long test length (four times the maximum length) 100% randomness of the 16-bit-wide PRS is still possible.

Aliasing probability of signature analysis. We introduce faults by requiring a single stuck-at-0 fault at the first stage ALU in each pipe. After a specified number of test patterns, we compare the ALU signatures with known good signatures. Aliasing occurs when a faulty pipe produces the same signature as a fault-free pipe.
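The definition of aliasing used here lends itself to a small Monte Carlo check. The sketch below is not the ALU experiment: it uses an 8-bit MISR with an assumed primitive polynomial (0x11D) and corrupts every response word with an independent, uniformly random error word, which corresponds to a symmetric-channel style error model rather than a single stuck-at-0 fault. All names and parameters are hypothetical.

```python
import random

POLY = 0x11D   # assumed primitive polynomial x^8 + x^4 + x^3 + x^2 + 1
WIDTH = 8


def misr_signature(words, seed=1):
    """Compact a stream of 8-bit response words into one signature."""
    state = seed
    for word in words:
        state <<= 1
        if state >> WIDTH:
            state ^= POLY
        state ^= word
    return state


def estimate_aliasing(trials=20000, length=32, rng_seed=1):
    """Fraction of corrupted streams whose signature matches the good one."""
    rng = random.Random(rng_seed)
    good = [rng.randrange(256) for _ in range(length)]
    good_signature = misr_signature(good)
    aliases = 0
    for _ in range(trials):
        # Each response word gets an independent random error word.
        faulty = [g ^ rng.randrange(256) for g in good]
        aliases += misr_signature(faulty) == good_signature
    return aliases / trials


print(estimate_aliasing(), 2 ** -WIDTH)
```

Under this error model the estimate should land near 2^(-8), the asymptotic value for an 8-bit register; a 16-bit suite would need far more trials to observe any aliasing at all.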

We compare the aliasing probability at the last stage of the three pipes which have a stuck-at-0 fault at the least significant bit of the first stage ALU output. As shown in Figure 7, the aliasing probability of a 16-bit CBIT suite approaches $O(2^{-16})$ as the test length increases in all three pipes. The straight pipe tends to receive aliases early for shorter test lengths. In contrast, a CBIT suite implementing LTA does not exhibit aliasing effects until a sufficiently long test time has elapsed. In this case, aliasing occurs at the sixth stage after 100 tests in the primitive LTA pipe. With test lengths smaller than the length of one maximum-length PRS, the aliasing probability becomes more pronounced. Figure 7b shows that aliasing occurs at the later stages before it shows up in the earlier stages. For the first stage, aliasing occurs after more than 1,000 tests. However, the sixth stage shows aliasing after fewer than 10 tests. This implies that a warm-up period reduces the aliasing probability at each stage for small test lengths of the pipelined LTA.

In general, CBIT suites implementing LTA exhibit smaller aliasing probabilities than straight pipe CBIT suites. When we use only one CBIT suite in the last stage for comparison, all aliasing probabilities stay at $O(2^{-16})$ (see Figure 7c). If we read the contents of the two CBIT suites as the complete signature, the aliasing probabilities for both pipes implementing LTA become $O(2^{-32})$ in our single stuck fault simulation, a negligible value compared to $O(2^{-16})$. This results from the CBIT suite's extended width ($2 \times 16$ in this case). Note that in Figures 7a and 7c the cascaded CBITs with nonprimitive characteristic polynomials give the same asymptotic value of the aliasing probability as that of the primitive feedback polynomials discussed by Damiani et al.

Area overhead and testing time. The area overhead for implementing LTA consists of the extra wiring required to cascade the CBIT suites with additional XORs (to implement the primitive generating polynomial). As mentioned earlier, constructing the scan path with cascaded CBITs eliminates the need for extra circuitry. The testing time for the LTA pipe is the same as that of the straight pipe. However, with some additional wiring and extended signatures, LTA pipes provide extra observability at each stage in the pipe and a much lower aliasing probability.

Pipes  with  ALU,  caches,  and  RAM 
(P-pipe) 

Our second experiment involves an MCM (consisting of one SN74LS181 ALU, one 8-bit RAM, and two 16×8 data caches) placed in a four-stage testing pipe. A 16-bit CBIT suite placed at the


Figure 6. Randomness measure of the CBITs as TPGs over six stages in the ALU pipes: straight pipe (a); cascaded CBITs with nonprimitive polynomial (b); cascaded CBITs with primitive polynomial (c).

Figure 7. Aliasing frequency in ALU pipes: at the sixth stage for the ALU pipes (a); cascaded CBITs with primitive polynomial (b); test lengths of 4 ML over six stages (c).


inputs of the ALU acts as the TPG. We insert four 8-bit CBIT cells between the CUTs. Demultiplexing the test patterns from the 8-bit CBIT connected to the inputs of the RAM tests both Address and Data inputs exhaustively. We refer to this as the 'straight P-pipe.' Adding an extra connection between neighboring CBITs so that paired 8-bit-wide CBITs can perform 16-bit signature analysis for two CUTs simultaneously (Figure 3b) results in the 'cascaded P-pipe.'

Randomness of the TPG. Figure 8 shows the randomness measure of each stage with different test lengths for the straight P-pipe. In Figure 8a, later stage CBITs reach 100% randomness when the input test length becomes greater than $2^8$ for the 8-bit analysis. In Figure 8b, the zero stage CBIT gives 100% randomness for the 14-bit-wide input bus to the ALU after the input test length becomes greater than four times the maximum length. This finding is consistent with the previous experiment, in which warming up the zero stage CBIT improves the quality of the TPG. Figure 9 shows the behavior of the cascaded P-pipe, which is similar to that of the straight P-pipe. This reaffirms the accuracy of our earlier assumption that the maximum-length PRS is generated when the test length is at least four times the maximum length for a given data width of the CBIT suites.

In Figures 8a and 9a, LTA requires just four times the maximum-length PRS generated by one CBIT suite to test neighboring CUTs ($4 \times 2^8$), as opposed to the length it requires from a double-width CBIT suite ($2^{16}$). The cascaded CBITs show a quicker rate of convergence than a straight pipe. Again, this is similar to the results of the previous ALU pipes.

Aliasing probability of signature analysis. We inject the single stuck-at-0 fault to the least significant bit of the ALU's output. We calculate the aliasing probability by comparing signatures from the faulty pipe with those from a fault-free pipe. Figure 10a (next page) shows the aliasing frequencies per injected fault of the final signatures in the last stage CBITs of the two P-pipes. Reading the signatures bytewise from each CBIT cell allows us to calculate the aliasing frequency. We also analyze complete signatures for the extended CBIT pairs. All aliasing frequencies per injected fault of the 8-bit-wide signatures converge to the asymptotic value ($2^{-8}$ for the 8-bit case). However, with a test length smaller than one cycle of the maximum-length PRS ($2^8$), the cascaded P-pipe gives a smaller bytewise aliasing frequency than the straight P-pipe. Aliasing does not occur for the extended signatures until the test length reaches one maximum length for the 16-bit signature analysis ($2^{16}$, or 65,536).

Figure 8. Randomness measure of the CBITs as TPGs over four stages for the straight P-pipe: test lengths are multiples of 256 ($L = m2^8$) (a); test lengths are multiples of 16,384 ($L = m2^{14}$) (b).

Figure 10b shows the aliasing frequencies for the intermediate stages in the P-pipe with cascaded CBITs. The last stage still gives the worst analysis result of the ALU pipes. Once again, warming up the P-pipes seems to improve the test quality. For comparison, we show the aliasing from extended signature analysis ($O(2^{-16}) \to 1.52 \times 10^{-5}$ for the complete 16-bit-wide CBIT suite).

Figure 9. Randomness measure of the CBITs as TPGs over four stages for the cascaded P-pipe: test lengths are multiples of 256 ($L = m2^8$) (a); test lengths are multiples of 16,384 ($L = m2^{14}$) (b).

Figure 10. Aliasing frequency: fourth stage for two P-pipes (a); cascaded CBITs with primitive polynomial in the P-pipe (b).

In these experiments, we analyze the randomness of the TPG and the aliasing problem for multiple-stage parallel signature analyzers. As the input test length traverses the entire cycle of the maximum-length PRS provided by the MISRs, it visits all states in the PRS at least once. Thus we have 100% randomness of the maximum-length PRS during the TPG process. This also applies to the downstream stages in one pipe. For a well-partitioned pipelined testing path, the aliasing probability is of the same order as that for the serial signature analyzer.
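The statement that one full cycle of a maximum-length PRS visits every state can be checked with a short sketch. The Python below steps an 8-bit Galois LFSR and reports the fraction of nonzero states visited. The polynomial $x^8 + x^4 + x^3 + x^2 + 1$ (`0x11D`), the seed, and the definition of coverage as "visited states divided by $2^n - 1$" are assumptions chosen for illustration; the randomness measure plotted in the figures may be defined differently.

```python
def lfsr_states(seed, steps, width=8, poly=0x11D):
    """Yield successive states of a Galois LFSR (multiply by x modulo poly)."""
    state = seed
    for _ in range(steps):
        yield state
        state <<= 1
        if state >> width:      # bit shifted past the register width
            state ^= poly       # reduce modulo the feedback polynomial

def state_coverage(seed, steps, width=8, poly=0x11D):
    """Fraction of the 2^width - 1 nonzero states seen in `steps` clocks."""
    visited = set(lfsr_states(seed, steps, width, poly))
    return len(visited) / ((1 << width) - 1)

# One full period (255 clocks) reaches every nonzero state.
full = state_coverage(seed=1, steps=255)
# A shorter run covers only part of the state space.
partial = state_coverage(seed=1, steps=100)
```

Running longer than one period, for example four times the maximum length, cannot lower the coverage; it only repeats states already visited.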

Area overhead and testing time. To implement LTA in the cascaded P-pipe from the straight P-pipe, the only extra wiring needed is that which connects the CBIT suites. To have a primitive polynomial for the extended CBIT suite, we provide spare XOR gates for the most and least significant suite configurations. Again, we require only two modes for initializing the CBITs and MISR mode analysis.

Performance comparison with other approaches

Two aspects make LTA with CBITs comparable with other testing approaches: testing time and area overhead. These approaches include the IEEE 1149.1 boundary scan standard (JTAG) and a pipelined BIST with conflict scheduling. We calculate testing time by adding the set-up time ($t_{setup}$), the module testing time ($t_{module}$), and the readout time ($t_{readout}$). Note that we normalize these times to the average testing time per module. Adding 40 to 80% for wiring to the required hardware components provides an estimate of the area overhead.

LTA does not require different CBIT cells in the design library to test different widths of the data paths. For a wider data bus, we can cascade the CBITs to get an extended PSA without downgrading the quality of the signature analysis. LTA thus eliminates the hardware penalty required by different sizes of BILBOs. (The only hardware overhead needed by LTA is the zero-stage CBIT for a new pipe, since the wiring for the cascaded case is negligible.) According to previous performance analyses of CBITs versus the boundary-scan method, CBIT takes less than 10% of the testing time and requires less than twice the area of boundary-scan designs. In both cases, the fault coverage is 100%. In our examples, we saw that in the six stages of the 16-bit pipelined CBIT suites, the aliasing frequency and probability stay as low as $O(2^{-16})$ for a sufficiently long test length. Therefore, with a limited area penalty and an order of magnitude improvement of the total testing time, LTA drastically reduces the cost of MCM testing in today's competitive market.

Comparing LTA to boundary scan. The boundary-scan approach needs two separate modes and carefully selected test patterns to test the processor for the interconnection failure. When the bit width of the communicating data path increases, the boundary scan requires more complicated test patterns and test cycles.

The original test vectors may not be available for boundary-scan testing in MCMs using automatic test pattern generation. Therefore, an $N$-input CUT requires $O(C \times 2^N)$ test patterns for pseudorandom testing. $C \ge 1$ is a constant given by a statistical estimation on a specific test pattern generation technique. The optimized value for $C$ is 1. However, to have confidence that the most difficult to detect faults are covered, a larger $C$ is required. The total time needed for one $N$-input CUT under boundary scan testing is given by

$$T_{module} = (C \times 2^N) \times (t_{setup} + t_{module} + t_{readout})$$

where $t_{setup}$, $t_{module}$, and $t_{readout}$ are the Scan_in, one execution, and Scan_out time for one CUT. However, LTA gives

$$T_{module} = t_{setup} + \frac{4 \times 2^N \times t_{module}}{k} + t_{readout}$$

where $4 \times 2^N$ is the maximum given by Equation 1 when $m = 2$ and $k$ is the total number of stages of a pipe implementing LTA. Thus, LTA saves time for scan-in, scan-out, and pseudoexhaustive testing per module/CUT, especially when $4 \ll k \ll 2^N$. For example, if $t_{module} = t_0$ clock cycles, $t_{setup} = 16$ clock cycles, and $t_{readout} = 16$ clock cycles for a 16-bit input/16-bit output CUT, then the boundary scan will need $65{,}536\,C\,(32 + t_0)$ clock cycles to finish the pseudoexhaustive testing. LTA requires $32 + 32{,}768\,t_0$ clock cycles for eight stages.
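The worked example above can be reproduced numerically. The following Python sketch evaluates both testing-time expressions for the 16-bit CUT; the function and parameter names are hypothetical, and $t_0 = 1$ clock cycle is an assumed value used only to obtain concrete numbers.

```python
def boundary_scan_cycles(n_inputs, t_setup, t_module, t_readout, c=1):
    """Boundary scan: C * 2^N * (t_setup + t_module + t_readout)."""
    return c * (2 ** n_inputs) * (t_setup + t_module + t_readout)

def lta_cycles(n_inputs, t_setup, t_module, t_readout, stages):
    """LTA: t_setup + (4 * 2^N * t_module) / k + t_readout."""
    return t_setup + (4 * (2 ** n_inputs) * t_module) / stages + t_readout

T0 = 1  # assumed module execution time in clock cycles

# 16-bit input/output CUT, 16-cycle scan-in and scan-out, C = 1, eight stages.
bs = boundary_scan_cycles(16, t_setup=16, t_module=T0, t_readout=16)
lta = lta_cycles(16, t_setup=16, t_module=T0, t_readout=16, stages=8)
speedup = bs / lta
```

With these assumed values the boundary-scan count is 65,536 × 33 cycles and the LTA count is 32 + 32,768 cycles, which matches the closed forms in the text for $C = 1$ and $t_0 = 1$.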

LTA does not require extra testing time in a separate mode for interconnection testing. (In contrast, boundary scan does.) So, the total testing time per module and its interconnection remains

$$t_{setup} + \frac{4 \times 2^N}{k}\,[\,t_{module} + (t_{interconnect} \approx 0)\,] + t_{readout}$$

since LTA can test both the interconnection and the processor logic at the same time. Here, $t_{interconnect}$ represents the time for signals to transfer through the interconnection network; its value is negligible compared to the other $t$'s in the formula. However, if the interconnections have a long propagation delay, we cannot ignore $t_{interconnect}$. For boundary scan, the total testing time per CUT with its interconnections is given by

$$(C \times 2^N) \times [\,t_{setup\text{-}all} + t_{module} + (t_{interconnect} \approx 0) + t_{readout\text{-}all}\,]$$

where $t_{setup\text{-}all}$ and $t_{readout\text{-}all}$ represent the sums of scan-in time and scan-out time, respectively. The time savings over boundary scan again indicates the efficiency of LTA during interconnection testing.

Considering the area overhead, both LTA and boundary scan need five extra electrical pads for one processing element. However, the control logic of boundary scan requires four I/O pins (test-mode-select, test-reset, scan-in, and scan-out).

Due to the XOR gates required to implement the extended generating polynomial, LTA consumes more area than boundary scan does. Our previous experiment in testing the MCM shows that LTA takes 4,240 transistors in the four-stage straight P-pipe case and boundary scan takes 3,040 transistors in total. Therefore, in this example LTA consumes about 39% more area than the boundary scan. However, with limited area overhead LTA provides a superior BIT implementation with improved state coverage and exponentially lower aliasing probability. Furthermore, LTA provides the extensibility for the PSAs in terms of the CBIT implementation and also the pipelining for several CUTs to be tested concurrently.

Comparing LTA to pipelined BIST with conflict scheduling. Other pipelined BIST approaches in Abadir and Breuer alternate modes of TPG and PSA in one LFSR circuit. Krasniewski and Albicki's implementation gives one-stage analysis for all the CUTs in a pipelined data path by centralizing the TPG and distributing the PSA (for example, using one set of BILBOs as TPGs for all pipes with a one-stage BILBO-CUTs-BILBO structure and separating outputs to several PSAs). Establishing conflict tables makes sure that the LFSRs perform TPG and PSA separately in different testing schedules. We call this approach pipelined BIST with conflict scheduling. The total averaged testing time for one $N$-input CUT ("kernel") is given by


$$T_{module} = t_{setup} + t_{module} + (2^N - 1) \times D + t_{readout} \qquad (8)$$

Here $2^N$ represents the number of test patterns and $D$ represents the latency between the current TPG and its immediate predecessor. The minimum (or optimized value) for $D$ is one clock cycle. Usually $D$ is greater than one clock cycle because generation of the new test pattern cannot occur until the system loads the previous pattern to the CUT/kernel as soon as the bus is available.

By comparing Equations 7 and 8, we see that the conflict table approach requires more testing time than LTA, even though we set the optimized pipelining schedule for the best value of $D$. This occurs because our LTA approach uses the fundamental characteristics of the MISRs to operate simultaneously as TPG and PSA. Thus LTA eliminates the waiting time for the available register and bus to generate a separate test pattern for the CUT/kernel. In addition, the possibility that additional MISR circuits or interconnections are required by the conflict scheduling to separate TPG and PSA modes does not occur in LTA. A 16-input/16-output CUT needs a testing time of at least $65{,}567 + t_0$ clock cycles (calculated from Equation 8 with $t_0 = t_{module}$), which is greater than the $32 + 32{,}768\,t_0$ clock cycles given by the 8-stage LTA ($t_0$ is at most one clock cycle).
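The comparison with conflict scheduling can be checked the same way. This Python sketch assumes the conflict-scheduling time has the form $t_{setup} + t_{module} + (2^N - 1) \times D + t_{readout}$ with the best-case latency $D = 1$, and it reuses the LTA expression with eight stages. The function names and the choice $t_0 = 1$ are illustrative assumptions.

```python
def conflict_scheduling_cycles(n_inputs, t_setup, t_module, t_readout, latency=1):
    """Pipelined BIST with conflict scheduling, latency D between patterns."""
    return t_setup + t_module + (2 ** n_inputs - 1) * latency + t_readout

def lta_cycles(n_inputs, t_setup, t_module, t_readout, stages):
    """LTA: t_setup + (4 * 2^N * t_module) / k + t_readout."""
    return t_setup + (4 * (2 ** n_inputs) * t_module) / stages + t_readout

T0 = 1  # assumed module execution time in clock cycles

conflict = conflict_scheduling_cycles(16, 16, T0, 16, latency=1)
lta = lta_cycles(16, 16, T0, 16, stages=8)
```

For $D = 1$ the first expression evaluates to $65{,}567 + t_0$, so even the best-case schedule needs roughly twice the cycles of the eight-stage LTA pipe under these assumptions.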


Due to the dual TPG/PSA mode provided by LTA, we reduce the exhaustive testing time by cutting the time required for scheduling conflicts on one MISR. In addition, the horizontal (for bit-size changes) and vertical (for multiple CUTs to be tested in a pipe) extensibility given by LTA provides the best utilization of the parallel testing. Therefore, we can perform pipelining and achieve parallelism on one system with minimal design modification and optimal test scheduling.

Our LTA scheme requires less testing time compared to boundary scan and pipelining with conflict scheduling without losing the effectiveness of the test coverage. We anticipate no significant area overhead (except for the spare XOR paths for cascadability) in LTA. Furthermore, we believe that by rearranging the placement of the CBIT circuits and including test scheduling, we can gain high testability and observability of the permanent faults in both the processor and interconnection.


WE PROPOSED A CBIT to test MCM modules configured in a pipelined fashion. Cascading the CBITs produces high test coverage with 100% randomness in the TPG process and low aliasing probability in signature analysis. In addition, the CBIT circuit can serve as a switching device for module reconfiguration.

We also introduced LTA as a way to reduce aliasing probability. When compared to the GLFSR approach, LTA gives a similar aliasing probability as the twofold GLFSR. LTA implementation also works when the I/O ports are moved to the center of the chip area in the future system design.

LTA exhibits greater efficiency when we partition the MCM testing sessions into several subcircuits for parallel testing. Yeh et al. discuss partitioning algorithms using netlists as inputs, while Srinivasan et al. propose partitioning at a higher level. Seth and Agrawal discuss in-depth analysis of testability of partitioned CUTs.

Our future work will emphasize integrating partitioning and clustering algorithms with LTA, to allow automation of hierarchical functional test methodology for MCMs. We expect that modifying LTA and CBIT circuitry to accommodate the interconnection reconfiguration and self-purging in the MCM will improve fault tolerance.

Abstract

A novel approach for partitioning circuits with high fan-ins which are not suitable for pseudo-exhaustive testing is presented. Circuits under test (CUTs) are modeled as directed graphs and a cost function is developed for the optimization algorithm. Disjoint circuit partitions are generated not only for reducing the exhaustive test length but also for pipelined testing. Experiments on benchmark circuits demonstrate that simulated annealing produces good results for future applications.

Introduction 

Pseudo-exhaustive testing offers 100% fault coverage [16] but suffers from long testing time for large fan-in circuits. Partitioning facilitates testing by disconnecting one part of a network from the other, thus making smaller subsystems for built-in self test (BIST). This simple divide and conquer strategy has been adopted to reduce the input size of each combinational circuit and so shorten the long testing time.

Partitioning of a complex circuit under test (CUT) using segmentation cells was proposed by [1] and [2]. These segmentation cells can be flip-flops added to the circuitry. They can also be generated by moving existing registers in the CUT to the boundary of the partitions and applying re-timing techniques [3] such that circuit timing requirements can be preserved or even improved along with better testability. The controllability, detectability, and observability of partitioned circuits can also be measured at a reduced effort compared to the unpartitioned circuit [4].

There are several approaches to circuit partitioning for pseudo-exhaustive testing. Output cone size reduction [5] reduces the dependency of each output cone of the CUT when the dependency is greater than a user-specified number, r. This may lead to overlapping segments at the expense of potential test schedule conflicts or extra effort in generating exhaustive test patterns. Udell's approach [6] gives fewer cut points than a two-way partition and clustering approach [7]. Area overhead, not explicitly represented in the cost function, is introduced by forward merging and collapsing the reconvergent fanouts. Bhatt et al. [2] give a leveled approach that tries to preserve the timing of a CUT by inserting extra segmentation cells. A coordinated circuit partition and test pattern generation approach by Jone and Papachristou [8] ensured disjoint segments by inserting more registers at the segment boundary.

In a continuing effort to systemize our pipelined built-in cascadable test (CBIT) technique [9] for very large integrated circuits and systems, we perform partitioning to improve test performance by marking cut points for segmentations. The simulated annealing technique [10] was used to find a global minimal solution under given performance constraints. Disjoint partitioning is proposed for implementing pipelined pseudo-exhaustive testing, which has the potential of being fast, efficient, and hierarchical for designing testable circuits. Experiments on the 1985 ISCAS benchmark circuits [11] were performed and the results are encouraging when compared to the approaches in [5][6].

This paper is organized as follows. We first explain our partitioning strategy for pipelined pseudo-exhaustive testing. The theoretical investigation and problem formulation follow to model the CUTs and the cost function for optimization. The simulated annealing algorithm is outlined and its application to finding an optimal partitioning is described briefly. Finally, simulation results on ISCAS 85 benchmarks are presented. Discussion of future work follows.

Formulation  of  the  Problem 

The partitioning for pipelined pseudo-exhaustive testing can be formulated as the classical m-way partitioning problem with user-specified constraints. To provide a better understanding and easier handling of the disjoint m-way partitioning problem, it is desirable to map the CUT to a graph representation and transform the constraints into mathematical equations. Thus we can model the problem as m-way graph partitioning.

Simulated annealing determines the number of partitions, m, by starting from a large initial value (say 50) and returning empty partitions. The intermediate segmentation cells can be grouped to form several registers with structures similar to the multiple-input shift registers (MISRs) [12]. In this way, the CUT is set up for potential pipelined scheduling during the testing mode.

A.  Graph  Construction 

A circuit can be represented as a weighted, directed graph, $G = (V, E)$, where $V$ is the set of vertices/nodes in $G$ and $E$ is the set of edges connecting two nodes in $V$. Modules or gates in the circuitry are modeled as vertices in the graph, i.e.,

$$V = \bigcup_{i=1}^{N} \{v_i\}, \text{ where } |V| = N \text{ is the number of vertices in } G.$$

Signals are modeled as edges with directions from source to sink in the network, e.g., $e_{ij}$ is from node $v_i$ to node $v_j$. When the signals are equally weighted, the original graph reduces to a directed graph.

An m-way partition, $\Pi: V \to \{1, 2, 3, \ldots, m\}$ and $\pi_i = \{v \in V \mid \Pi(v) = i,\ 1 \le i \le m\}$, is an operation such that $V = \bigcup_{i=1}^{m} \pi_i$. Two different partitions, $\Pi$ and $\Pi'$, are called neighbors if there exists a move, $O$, which is a permutation of the $v_i$'s in the current partition $\Pi$, $O: \Pi \to \Pi'$, such that

$$O(\{\pi_i\}_{i=1}^{m}) = \{\pi'_i\}_{i=1}^{m}.$$

Fig. 1 shows an example of the construction of G. The original schematic of a CUT is shown in Fig. 1a. This small test circuit, c17 from the ISCAS 85 benchmark, has five primary inputs, six NAND2 gates, and two primary outputs. Fig. 1b shows the equivalent graph representation of this circuit.
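As a concrete illustration of the graph construction, the sketch below encodes the c17 netlist in Python and derives a vertex list and a directed edge list from it. The signal names follow the commonly distributed ISCAS 85 netlist numbering; treating primary inputs as vertices alongside the gates, and the helper name `build_graph`, are assumptions made for this example and may differ from the representation drawn in Fig. 1b.

```python
# c17: each NAND2 gate mapped to the two signals that drive it.
C17_GATES = {
    "G10": ("G1", "G3"),
    "G11": ("G3", "G6"),
    "G16": ("G2", "G11"),
    "G19": ("G11", "G7"),
    "G22": ("G10", "G16"),
    "G23": ("G16", "G19"),
}
PRIMARY_INPUTS = ["G1", "G2", "G3", "G6", "G7"]
PRIMARY_OUTPUTS = ["G22", "G23"]

def build_graph(gates, primary_inputs):
    """Return (vertices, edges); each edge runs from source signal to sink gate."""
    vertices = list(primary_inputs) + list(gates)
    edges = [(src, gate) for gate, fanin in gates.items() for src in fanin]
    return vertices, edges

vertices, edges = build_graph(C17_GATES, PRIMARY_INPUTS)

# Fan-in of every gate, recovered from the edge list.
fanin = {g: sum(1 for _, dst in edges if dst == g) for g in C17_GATES}
```

All edges carry equal weight here, so the structure is a plain directed graph, as noted in the text for equally weighted signals.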


Fig. 1a Schematic representation of a circuit under test (CUT)

Fig. 1b Graph representation of a circuit under test (CUT)

B.  Partitioning  with  Input  Constraints  (PIC) 

Here we provide a study of the fundamentals of this disjoint m-way graph partitioning problem and prove it to be NP-hard. A function, $w$, is defined as a mapping $w: E \to I$ (integer) such that

$$w(\pi_i) = |V_i| = e_i,$$

where $e_i$ is the total number of non-self, incoming edges to partition $\pi_i$.

For a given graph, $G = (V, E)$, and an integer, $l$, we want to partition $V$ into $m$ subsets, $\pi_1, \pi_2, \ldots, \pi_m$, with the objective of minimizing $m$ under the constraint that $w(\pi_i) \le l$, $\forall i,\ 1 \le i \le m$.
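The input constraint can be made concrete with a short Python sketch that counts, for each partition, the edges arriving from outside it and compares the count with the limit $l$. The edge-list format, the `assignment` dictionary, and the function names are hypothetical. Two modeling choices are assumptions: sources that are not assigned to any partition (such as primary inputs) count as external, and a signal feeding two gates in the same partition is counted once per edge rather than once per signal.

```python
def incoming_edges(edges, assignment, part):
    """Count non-self incoming edges: the sink is in `part`, the source is not."""
    return sum(
        1
        for src, dst in edges
        if assignment.get(dst) == part and assignment.get(src) != part
    )

def satisfies_input_constraint(edges, assignment, limit):
    """True when every partition has at most `limit` incoming edges."""
    parts = set(assignment.values())
    return all(incoming_edges(edges, assignment, p) <= limit for p in parts)

# Toy circuit: g1 has inputs a, b; g2 has inputs g1, c.
edges = [("a", "g1"), ("b", "g1"), ("g1", "g2"), ("c", "g2")]
split = {"g1": 1, "g2": 2}     # two partitions, two inputs each
merged = {"g1": 1, "g2": 1}    # one partition, three external inputs
```

With a limit of two inputs per partition, the split assignment is feasible while the merged one is not, which is the trade-off between $m$ and $l$ that the optimization explores.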


NP COMPLETENESS: The partitioning with input constraints is NP-complete.

Proof:

We first state a 3-Partition Problem which is NP-complete. Then we show that a 3-Partition Problem can be transformed to the problem of partitioning with input constraints.

(i) 3-Partition Problem [17]:

A set $V$ of $3k$ elements, a bound $B \in Z^+$, and a size $s(v) \in Z^+$ for each $v \in V$ such that $B/4 < s(v) < B/2$ and such that $\sum_{v \in V} s(v) = kB$.

It is NP-complete in the strong sense to partition $V$ into $k$ disjoint sets, i.e., $V_1, V_2, \ldots, V_k$, such that for $1 \le i \le k$, $\sum_{v \in V_i} s(v) = B$.

(ii) For each member $v_i$ in the 3-Partition Problem, we construct a node $v_i$ with $s(v_i)$ distinct inputs. Suppose $l = B$; if we can achieve $m = k$, we can solve the 3-Partition Problem. Therefore, the PIC problem is NP-complete.

This PIC problem is still NP-hard even when some nodes are duplicated in different partitions. A statement can be drawn from the above proof:

The partitioning with input constraints remains NP-complete even if we allow replicates, i.e., $\pi_i \cap \pi_j \ne \emptyset$.

C.  Cost  Function 

An $N$-input CUT can be tested pseudo-exhaustively when it is divided into $m$ segments according to the constraint that each partition cannot have more than $l$ inputs [16]. The resulting test length for each partition is at most $2^l$, which is much smaller than the original $2^N$. If the partitioned CUTs do not overlap one another, the time dependency for testing those segments is minimal such that the partitions can be tested concurrently. When resource sharing conflicts exist among those $m$ partitions during testing, the worst case is to test partitions sequentially. Therefore, the total testing time, $T$, for a partitioned CUT is

$$\max_{1 \le i \le m} \{t_i\} \le T \le \sum_{i=1}^{m} t_i,$$

where $t_i$ is the total testing time needed for each segment. $T$ can be as fast as all the partitions tested in one session or as slow as testing one partition after the other.

The objective of our approach to partitioning for pipelined pseudo-exhaustive testing can be represented as a cost function on the circuit graph, G. The major concerns in our implementation are that the total number of incoming edges to all partitions should be minimal and that the total number of CBITs needed (and thus the area overhead) should be minimized. To avoid a large $e_i$ being summed with a small $e_j$ in the minimized total number of incoming edges, we also want to minimize the maximum value of $e_i$.

In order to make the penalty of incoming edges to each partition significant and to minimize the total number of CBITs, the cost function is defined as


(t.-i) 


I  i^m  l^i^m 

max  {t.}  -  llT*  _ 

+  c  ’  0)| 

, where $c$ is the total number of CBITs needed for pipelined pseudo-exhaustive testing.

Simulated Annealing

Simulated annealing has been shown to give good results in solving problems that search for a global optimum (or optima), such as constrained partitioning [10][13]. For our m-way partition, we adopt the simulated annealing algorithm [14] as the core algorithm and develop problem-specific modifications for the whole annealing/optimization process.

A.  The  Generic  Algorithm 

Simulated annealing uses a parameter called temperature to control the probability of accepting a random uphill move. An 'uphill move' denotes a randomly picked solution having a higher cost than the current solution. Accepting such moves is designed to avoid being trapped in local optima. A cooling ratio, $\gamma$, where $0 < \gamma < 1$, controls how the temperature is gradually lowered until the stopping criteria are reached. We let $L$ be the number of random trials for a given temperature, where $L = S \times$ SIZEFACTOR and $S$, the size of the neighborhood, reflects the size of the problem. There are $S$ different neighboring states for a single move from the current partition. SIZEFACTOR is an experimental parameter which gives more chances for the random trials to cover more neighbors. Table 1 shows the generic algorithm of simulated annealing, and Step 3 is the Metropolis loop [14].


STEP 1  Generate a random initial solution $\{\pi_i\}_{i=1}^{m}$ and calculate its cost, $C$.
STEP 2  Choose an initial temperature $T_0 > 0$.
STEP 3  While (stop criteria not met) do
    3.1  Perform the following random trials $L$ times for one $T$:
        3.1.1  Pick a random neighbor $\{\pi'_i\}_{i=1}^{m}$ of $\{\pi_i\}_{i=1}^{m}$.
        3.1.2  Calculate the cost of the neighbor, $C'$.
        3.1.3  Let $\Delta = C' - C$.
        3.1.4  If (random $< e^{-\Delta/T}$), accept $\{\pi'_i\}_{i=1}^{m}$.
               (If $\Delta \le 0$, do a downhill move; otherwise accept with a probability $e^{-\Delta/T}$ for the uphill move.)
        3.1.5  Update $\{\pi_i\}_{i=1}^{m}$ to $\{\pi'_i\}_{i=1}^{m}$ and $C$ to $C'$.
        3.1.6  If $C$ is better than best_C, update best_C to $C$ and the best solution to $\{\pi_i\}_{i=1}^{m}$.
    3.2  Change $T$ to $\gamma T$.
STEP 4  Return the best solution $\{\pi_i\}_{i=1}^{m}$.

Table 1: Generic Simulated Annealing Algorithm
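The steps of Table 1 can be sketched as a generic Python routine. This is an illustration under stated assumptions, not the C implementation used in the experiments: the function signature, the `patience` stopping rule (a fixed number of consecutive temperatures with a low acceptance ratio), the `max_rounds` safety cap, and the toy cost function are all hypothetical.

```python
import math
import random

def simulated_annealing(initial, cost, random_neighbor, t0=10.0, gamma=0.95,
                        trials=100, min_accept=0.005, patience=5,
                        max_rounds=1000, rng=None):
    """Generic annealing loop following Steps 1 to 4 of the table."""
    rng = rng or random.Random(0)
    current, c = initial, cost(initial)            # Step 1
    best, best_c = current, c
    temp, low_rounds = t0, 0                       # Step 2
    for _ in range(max_rounds):                    # Step 3
        accepted = 0
        for _ in range(trials):                    # 3.1 Metropolis loop
            cand = random_neighbor(current, rng)   # 3.1.1
            cand_c = cost(cand)                    # 3.1.2
            delta = cand_c - c                     # 3.1.3
            # 3.1.4: downhill always, uphill with probability exp(-delta/T)
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                current, c = cand, cand_c          # 3.1.5
                accepted += 1
                if c < best_c:                     # 3.1.6
                    best, best_c = current, c
        low_rounds = low_rounds + 1 if accepted / trials < min_accept else 0
        if low_rounds >= patience:                 # assumed stop criterion
            break
        temp *= gamma                              # 3.2
    return best, best_c                            # Step 4

# Toy problem: minimize (x - 7)^2 over the integers with +/-1 moves.
best, best_c = simulated_annealing(
    initial=0,
    cost=lambda x: (x - 7) ** 2,
    random_neighbor=lambda x, rng: x + rng.choice((-1, 1)),
)
```

For partitioning, `initial` would be a partition assignment, `cost` the cost function on the circuit graph, and `random_neighbor` a single vertex move.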

B.  Finding  A  Neighbor  and  Move 

Two sets of partitions are called neighbors if we can find
a move from one to the other. In this work, we adopt the vertex
move, in which a single vertex in one partition, π_i, is moved
to another, π_j, such that for an N-node graph, the size of the
neighborhood for an m-way partition is S = N × (m - 1).
The set of neighbors generated by the vertex move has the
advantage of a smaller increase in the size of the
neighborhood, which is O(N) for a given m, whereas vertex
exchange gives O(N²) as N increases [13].
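
As a small illustration of the neighborhood size, the sketch below enumerates every vertex move for a given assignment. The representation (a tuple mapping each vertex to a block index) is an assumption made for the example, and no attempt is made to forbid moves that empty a block.

```python
def vertex_move_neighbors(assignment, m):
    """Yield every assignment reachable by moving one vertex to another block.

    assignment[v] is the block index (0 .. m-1) of vertex v.
    """
    for v, block in enumerate(assignment):
        for target in range(m):
            if target != block:
                neighbor = list(assignment)
                neighbor[v] = target
                yield tuple(neighbor)


def neighborhood_size(n_nodes, m):
    """S = N x (m - 1), the number of single-vertex moves."""
    return n_nodes * (m - 1)
```

For a 5-node graph and a 3-way partition, the generator produces 5 × 2 = 10 distinct neighbors, in agreement with S.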

C.  The  Cooling  Schedule 

The cooling ratio, γ, is a reducing factor for the
annealing process as it gets out of local optima and moves
towards a global minimum. To improve the high rejection rate
at low temperatures, an Adaptive Cooling Schedule was
proposed in [15]. One analysis of the adaptive cooling
schedule [10] shows that longer annealing time always gives
better results regardless of the design of the cooling schedule.
Therefore, we reduce the temperature by a factor of γ per
Metropolis iteration for the cooling schedule.

Experiments 

The simulations are implemented in the C language and
conducted on Sun SPARC 1+ stations. For best results with the
simulated annealing technique, sample experiments were
performed to obtain proper settings of the parameters.

A.  Setting  the  Parameters 

There are several parameters in the generic algorithm
that need to be tuned [10] for simulated annealing: γ, the cooling
factor; SIZEFACTOR, which determines the number of iterations
for the Metropolis loop; and MINPERCENT, the minimum
acceptance ratio for stopping at a low temperature. Here we
adopt the analysis from [10] and [13] for the following
settings: SIZEFACTOR = 16 and γ = 0.95. MINPERCENT
is set to 0.005 according to sets of trial simulations.

There is one stopping criterion in our simulation: the
acceptance ratio is below MINPERCENT for 5
consecutive temperatures. This stops the simulation when
most of the trials are rejected.
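
The parameter settings translate into simple arithmetic, sketched below. The helper names are hypothetical, MINPERCENT is assumed to be a ratio (0.005) rather than a percentage, and the 100-node, 8-way example is illustrative only.

```python
import math


def trials_per_temperature(n_nodes, m, sizefactor=16):
    """L = S x SIZEFACTOR with S = N x (m - 1)."""
    return n_nodes * (m - 1) * sizefactor


def steps_to_cool(t0, t_final, gamma=0.95):
    """Number of geometric cooling steps needed so that t0 * gamma**k <= t_final."""
    return math.ceil(math.log(t_final / t0) / math.log(gamma))


def should_stop(acceptance_ratios, minpercent=0.005, window=5):
    """True when the last `window` temperatures all had a ratio below MINPERCENT."""
    recent = acceptance_ratios[-window:]
    return len(recent) == window and all(r < minpercent for r in recent)
```

With these settings, a 100-node circuit split 8 ways gets 11200 trials per temperature, and cooling by three orders of magnitude takes 135 temperature steps.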

B.  Results  and  Comparison 

A generally accepted set of test cases is the ISCAS 85
benchmarks. Table 2 shows the results for a given input
constraint, CBIT_width = l = 16. The 'internal net cuts'
suggested by the simulated annealing algorithm are for future
placement of the modified flip-flops which construct one
CBIT dedicated to one partition for testing. These 'cut points'
are for pipelined pseudo-exhaustive testing performing
distributed test pattern generation (TPG) according to the
input number constraint for circuit clusters in a CUT. As
outlined in Table 5, pipelined pseudo-exhaustive testing
outperforms existing approaches (e.g., PEST [19]) by saving
the expensive centralized test pattern generator and extra I/O
pins/points for controlling/observing the internal cut points.
Test quality degradation for pipelined pseudo-exhaustive
testing is O(2^(-?)) (exponent illegible in source), as the number
of pipelining stages, k, is much smaller compared with

Table 3 and Table 4 list the previous results from similar
approaches in [5] and [6]. Since our partitioning strategy is
based on the pipelined testing framework [9], which is
different from the testing architecture in [5] and [6], Tables 3
and 4 are for the reader's reference only.
Conclusion

We have demonstrated that circuit partitioning for
pipelined pseudo-exhaustive testing can be formulated as the
NP-complete m-way partition problem and approached by simulated
annealing. Experimental results demonstrate the superiority of
simulated annealing when it is combined with a complete set of
cost functions and parameter settings.

Future work can be pursued in related areas, such as
a thorough study on adding a pre-processing program/module
to recognize user-specified 'untouchable' cells (or hard
modules). Extension to sequential circuits with loop cutting
and re-timing techniques would be a good modification to
current approaches for a better global optimization.
Circuit Name    Internal nets cut by SA    No. of partitions
C432            49                         8
C499            47                         8
C880            72                         12
C1355           93                         28
C1908           203                        45
C2670           263                        54
C5315           548                        124
C6288           1069                       164

Table 2: Partition Results from Eq. (1) for l = 16


                No. of cut points                Internal nets cut
Circuit Name    CSR [5]        ESP-BIST [6]      by SA
C432            (illegible)    27                49
C499            8              8                 47
C880            11             11                72
C1355           8              8                 93
C1908           18             21                203
C2670           33             33                263
C5315           54             54                548
C6288           -              84                1069

Table 3: No. of Cuts Referencing to Related Works for l = 16


                No. of partitions
Circuit Name    ESP [18]       SA
C432            ~25            8
C499            ~40            8
C880            ~40            12
C1355           ~40            28
C1908           ~75            45
C2670           ~60            54
C5315           ~150           124
C6288           ~130           164

Table 4: No. of Partitions Referencing to Related Works for l = 16


Factor \ Test Approach    PEST [19]                         Pipelined Pseudo-exhaustive Testing
Testing Time              (expression illegible)            (expression illegible)
Area Overhead             scan path + MISR                  scan path + MISR
                          (expression illegible);           (expression illegible);
                          extra TPG circuitry,              NONE
                          extra boundary scan circuitry
Fault Coverage/           1 - 2^(-?)                        1 - k × 2^(-?), from [9]
Test Effectiveness

("?" marks an exponent that is illegible in the source.)

Table 5: Comparison between PEST Approach and Pipelined Signature Analysis




Exhibit  IV 


Submitted  to  the  Sixth  IEEE  Symposium  on  Parallel  and  Distributed  Processing 
Dallas, TX - October 26-29, 1994


Performability  Analysis  of  Non-repairable  Multicomponent  Systems 

Using  Order  Statistics  * 


Amiya  Bhattacharya 
CSE  Department 
(619)534-7883 
amiya@cs.ucsd.edu 


Ramesh  R.  Rao 
ECE  Department 
(619)534-6433 
rrao@ucsd.edu 


Ting-Ting  Y.  Lin 
ECE  Department 
(619)534-4738 
lin@ece.ucsd.edu 


University  of  California,  San  Diego 
La  Jolla,  CA  92093 


Abstract 

Performability, a composite measure that integrates both performance and reliability, has been
deemed to be essential in evaluating systems that are capable of trading off performance for
reliability under component failures. For non-repairable systems, the goal is to evaluate the
distribution or moments of some accumulated reward (performance) defined on a stochastic process that
characterizes the different configurations, as successive faults appear. In this paper, instead of
assuming that the distributions of sojourn time at various configurations are known, we deduce them
from the distributions of the individual component lifetime. We allow lifetimes to be arbitrary but
assume that they are independent from one component to another. Knowledge of the reward rates in
different configurations allows us to relate the sojourn times to the order statistics of the
component lifetimes. We then solve for the distributions of the order statistics. The approach is
illustrated in the context of a parallel system made of identical components.

Keywords: Graceful degradation, Reliability, Performance, Sojourn time.


1  Introduction 

Stochastic modelling and analysis have been extensively used in quantitative evaluation of both the reliability
and performance of computer or communication systems. Traditionally, they have been studied as two
separate aspects. However, a need for a composite metric for integrating these two arises for analysing
the so-called degradable systems. By virtue of their fault detection and reconfiguration capability, they can
operate in several degraded modes as a consequence of component failures.

Degradable systems form a special class of fault-tolerant systems where reliability can be traded off at
the cost of performance. Performability, a measure that combines both performance and reliability, has
been developed as an accepted standard to capture this trade-off [2, 17]. It is defined as the probability
that the system reaches an accomplishment level over a utilization interval called the mission time. As
faults occur, the system goes through different configurations that can be characterized by a stochastic
process. Oftentimes, even if the system is potentially repairable, the designer is concerned with the transient
behavior of the system over the mission. Thus the problem reduces to the evaluation of the distribution
and/or moments of some accumulated performance metric (reward) defined on the stochastic process. This
freedom of choice for the reward makes the measure more powerful and versatile.

The first effort at integrating performance and reliability for a degradable computing system was made
by Beaudry [1], where the total amount of computation available from a degrading computer system was
analyzed under a Markov model of processor failure. In another early work, Huslende [4] proposed a different
measure, the probability that the system performance remains above a certain level at all times during a
mission. In the framework of performability introduced by Meyer [2], it was clear that both these measures
(and many others) could be interpreted as different assignments of reward rates [8].

Meyer [5] and Furchtgott and Meyer [6] showed how to obtain a closed form solution to the performability
distribution by conditioning on the state trajectories (sample paths). We closely follow the outline for
analysing systems with arbitrary stochastics of failure that was presented in [6]. Before we present our
approach, it should be noted that the integral solutions which arise require enumeration of all possible
state trajectories, involving numerical algorithms with linear complexity in terms of the trajectories. For
a large system, the computations involved could be very complex depending on the degradation pattern.
A semi-Markovian characterization of the underlying stochastic process was considered appropriate for a
non-repairable degrading system with independent failures [9], and found to lead to a tractable solution [6].
However, as illustrated by Ciardo et al. [12], identifying the sojourn time distribution in the semi-Markov
model may not be straightforward for a system with complex configuration.

Due to this complexity, most researchers analyzed performability using the Markov reward model [5, 7, 8,
9, 10, 11, 14, 19], where the problem reduces to the transient analysis of the underlying Markov chain. This
was basically a generalization of Beaudry's idea captured in the new framework of performability. However,
due to the memoryless property of the Markov process, it implies an exponential distribution of sojourn time, and
a constant hazard rate. In practice, system components have been better characterized by a varying hazard rate
over the lifetime, often modelled by the Weibull distribution [18]. Although use of an exponential approximation to
a non-exponential distribution may be sufficient for a steady-state analysis [18], it can introduce considerable
error in transient analysis [12].

The definition of performability we use is derived from Meyer [2]. The system we consider here has been
studied earlier in [5] and [8], where the underlying stochastic process was assumed to be Markov, giving rise
to the memoryless distribution of sojourn times at all degraded configurations. The method outlined in this
paper is more general in this regard. We deduce the distributions of the sojourn time from the distributions
of individual component lifetimes, which are assumed to be independent from one component to another. The
lifetimes need not be memoryless; they can be arbitrary. A semi-Markovian process automatically arises out
of the order statistic formulation, but there is no need for explicitly defining its sojourn time distribution.

The  rest  of  the  paper  is  organized  as  follows.  The  definition  of  performability  and  related  measures  is 
presented  in  section  2,  along  with  the  related  assumptions  and  an  example.  In  section  3,  we  formulate  the 
performability  problem  for  a  system  with  no  critical  component  in  terms  of  the  order  statistics  of  random 
variables  representing  component  lifetimes,  and  derive  closed  form  solutions  for  the  related  measures.  These 
results  are  then  extended  to  the  system  with  critical  component  in  section  4.  In  section  5,  some  numerical 
values  for  the  derived  expressions  are  plotted  to  illustrate  the  effect  of  the  critical  component  and  its  mean 
lifetime.  We  conclude  the  paper  in  section  6  by  summarizing  the  contribution  with  a  focus  to  further  work. 


2  Overview 

Let $\{X_t\}_{t \ge 0}$ be a continuous-time discrete-space stochastic process representing the state of the
degrading system. The state-space $Q$ is finite for any practical system, however large it may be. Each state
$q \in Q$ represents a possible configuration of the system with some components failed, and has an associated
reward rate $\rho_q = \rho(q)$, a non-negative real number reflecting the level of performance per unit time, when
the system is in state $q$. The system we analyze here is made of $n$ identical components, and it degrades
gracefully when the components fail one by one. The states are $q_k$, $k = 0, \ldots, n$, where $q_k$ denotes the
state with $k$ faulty components.

Performability of the system over a mission time $t$ is defined as the probability density function of the
accumulated reward (performance)

$$Y_t = \int_0^t \rho(X_u)\,du. \tag{1}$$

The cumulative distribution function of $Y_t$ is often referred to as the performability distribution function.
The mean value of $Y_t$, which we would call mean reward before failure (MRBF), is a useful characterization.
For a nonrepairable system, the transitions between configurations take the shape of an acyclic graph, so
that reentry to a state is not feasible. The sojourn time or the residence time $\tau_q$, which is the total time
spent by the system in configuration $q$ over the mission time $t = \sum_{q \in Q} \tau_q$, is therefore contiguous.
Equation (1) thus reduces to

$$Y_t = \sum_{q \in Q} \rho_q \tau_q, \tag{2}$$

which also provides a simple expression for MRBF

$$E[Y_t] = \sum_{q \in Q} \rho_q E[\tau_q]. \tag{3}$$

When the accumulated reward until the complete system failure is of interest, one should consider the
value of $Y_t$ as $t \to \infty$. The corresponding measures to be considered are the distribution functions and mean
value of

$$Y = \lim_{t \to \infty} Y_t = \int_0^\infty \rho(X_u)\,du, \tag{4}$$

which we henceforth consider in this paper.

Example Let us consider a single component system having just the working and failed states. If we assign
$\rho = 1$ for the fault-free state and $\rho = 0$ for the failed one, the performability distribution function reduces to

$$\Pr[Y \le y] = \Pr[T \le y],$$

where $T$ is the random variable representing the lifetime of the component. The reliability $R(t)$ can be
written as

$$\begin{aligned}
R(t) &= \Pr[T > t] \\
     &= 1 - \Pr[T \le t] \\
     &= 1 - \Pr[Y \le t].
\end{aligned}$$

Clearly, the MRBF reduces to the mean time to failure (MTTF) [18]. □

One implicit assumption behind formulating a composite performance-reliability measure as above is that
the performance and failure rates are functions of only the system state. This means that increasing load
or stress during the degradation due to component failure does not affect the lifetime distribution of the
surviving components. This is not only a feature which makes the analysis tractable, but also a desired
characteristic in many implementations.

3 Order Statistics Formulation of Performability without Critical Component

In the previous section we identified the state-space associated with the graceful degradation of an
$n$-component system. We start this section with the assumption that all the states $q_k$, $k = 0, \ldots, n$ are
visited in that strict order as the system degrades gracefully. In effect, we ignore the possibility of any
abrupt failure caused by the failure of any critical component that holds the $n$ components together. The
boundaries of the sojourn times can now be identified with the time of failure of the successive components.
These are precisely the order statistics of the random variables representing lifetime of the components. In
subsection 3.1, we formally introduce the order statistics, and explore how their distributions relate to the
distributions of the component lifetimes. Subsection 3.2 shows the result from a traditional Markov model of
the order statistics variables to be a special case of these formulations. Finally in subsection 3.3, the closed
form expressions for the performability distribution and MRBF are derived in terms of the distributions of
the order statistics of component lifetimes.
3.1  Order statistics of lifetime

If a family of random variables $T_1, T_2, \ldots, T_n$ are arranged in ascending order of magnitude and renamed as
$T_{1:n} \le T_{2:n} \le \cdots \le T_{n:n}$, the new random variable $T_{k:n}$ is called the $k$th order statistic for
$k = 1, 2, \ldots, n$. Suppose $T_1, T_2, \ldots, T_n$ represent the lifetimes of the identical components, and as a
result, they are iid (independent and identically distributed) with cdf $F(t) = \Pr[T_i \le t]$. The cdf $F_{k:n}(t)$
of the order statistic $T_{k:n}$ of lifetimes can be derived as

$$\begin{aligned}
F_{k:n}(t) &= \Pr[T_{k:n} \le t] \\
           &= \Pr[\,k \text{ or more components fail in } [0, t]\,] \\
           &= \sum_{i=k}^{n} \Pr[\,\text{Exactly } i \text{ components fail in } [0, t]\,] \\
           &= \sum_{i=k}^{n} \binom{n}{i} \{F(t)\}^{i} \{1 - F(t)\}^{n-i},
\end{aligned} \tag{5}$$

which is the tail probability of a binomial distribution. Using an identity (see appendix for details), this
can also be written as an incomplete beta function,

$$F_{k:n}(t) = I_{F(t)}(k,\, n - k + 1). \tag{6}$$

To derive the pdf $f_{k:n}(t)$, let us consider the probability of the event that the $k$th failure occurs in
$[t, t + \delta t]$. This means $k - 1$ components must fail in $[0, t]$, and $n - k$ must fail in $[t + \delta t, \infty)$.
Moreover, there are $n!/((k-1)!\,1!\,(n-k)!)$ ways to choose the first $k - 1$, the $k$th and the rest $n - k$
components out of $n$. As all components fail independently, we have

$$\Pr[t \le T_{k:n} \le t + \delta t] = \frac{n!}{(k-1)!\,(n-k)!} \{F(t)\}^{k-1} f(t)\,\delta t\, \{1 - F(t + \delta t)\}^{n-k}.$$

Dividing both sides by $\delta t$ and taking the limit as $\delta t \to 0$, we get

$$f_{k:n}(t) = \frac{n!}{(k-1)!\,(n-k)!} \{F(t)\}^{k-1} \{1 - F(t)\}^{n-k} f(t). \tag{7}$$

The same expression for $f_{k:n}(t)$ can also be obtained by differentiating $F_{k:n}(t)$ in equation (6) with respect
to $t$.

A straightforward extension of the above argument yields an expression for the joint pdf of two or more
order statistics of the lifetime distribution of individual components. For the pairwise joint pdf
$f_{i,j:n}(s, t)$, $i < j$, $s \le t$, we consider the event that the $i$th failure occurs in $[s, s + \delta s]$ and the
$j$th failure occurs in $[t, t + \delta t]$. The first $i - 1$ failures occur in $[0, s]$, $j - i - 1$ intermediate failures
occur in $[s + \delta s, t]$ and the remaining $n - j$ failures occur in $(t + \delta t, \infty)$. The choice of the sequence
can be done in $n!/((i-1)!\,1!\,(j-i-1)!\,1!\,(n-j)!)$ ways. Due to the independence of their lifetimes,

$$\begin{aligned}
\Pr[s \le T_{i:n} \le s + \delta s,\ t \le T_{j:n} \le t + \delta t]
  = {} & \frac{n!}{(i-1)!\,1!\,(j-i-1)!\,1!\,(n-j)!} \{F(s)\}^{i-1} f(s)\,\delta s \\
       & \times \{F(t) - F(s + \delta s)\}^{j-i-1} f(t)\,\delta t\, \{1 - F(t + \delta t)\}^{n-j},
\end{aligned}$$

which gives

$$f_{i,j:n}(s, t) = \frac{n!}{(i-1)!\,(j-i-1)!\,(n-j)!} \{F(s)\}^{i-1} \{F(t) - F(s)\}^{j-i-1} \{1 - F(t)\}^{n-j} f(s) f(t). \tag{8}$$

By similar reasoning, we can have a general expression for the joint pdf of any $k$ order statistics [3]. For
$1 \le n_1 < n_2 < \cdots < n_k \le n$ and $t_1 \le t_2 \le \cdots \le t_k$,

$$\begin{aligned}
f_{n_1, \ldots, n_k:n}(t_1, \ldots, t_k) = {} & \frac{n!}{(n_1 - 1)!\,(n_2 - n_1 - 1)! \cdots (n - n_k)!} \{F(t_1)\}^{n_1 - 1} f(t_1) \\
  & \times \{F(t_2) - F(t_1)\}^{n_2 - n_1 - 1} f(t_2) \cdots \{1 - F(t_k)\}^{n - n_k} f(t_k).
\end{aligned} \tag{9}$$

3.2  Comparison  with  a  Markov  model 

Before evaluating performability or MRBF, let us verify that we have indeed arrived at a more general
solution of sojourn time boundaries than what we could expect by transient analysis of a Markov model of
the system. With Markovian assumptions, the chain of transitions translates to a pure death process with a
constant hazard rate $\lambda$. At most one transition due to failure can occur in the small interval
$[t, t + \delta t]$. This gives rise to exponentially distributed component lifetime with hazard rate $\lambda$ [18],

$$F(t) = 1 - e^{-\lambda t}.$$

The transition probabilities are given by

$$\begin{aligned}
P_{k,k+1} &= (n - k)\lambda\,\delta t, & k &= 0, \ldots, n - 1 \\
P_{k,k}   &= 1 - (n - k)\lambda\,\delta t, & k &= 0, \ldots, n - 1 \\
P_{n,n}   &= 1 \\
P_{i,j}   &= 0 & & \text{otherwise}
\end{aligned} \tag{10}$$

Denoting $\Pr[T_{k:n} \le t]$ by $p_k(t)$, for $k = 1, \ldots, n$, we have

$$\begin{aligned}
p_k(t + \delta t) &= \Pr[T_{k:n} \le t + \delta t] \\
  &= \Pr[\,k \text{ or more components fail in } [0, t + \delta t]\,] \\
  &= 1 \times \Pr[\,k \text{ or more components fail in } [0, t]\,] \\
  &\quad + \Pr[\,\text{One failure in } [t, t + \delta t] \mid \text{Exactly } k - 1 \text{ components fail in } [0, t]\,] \\
  &\qquad \times \Pr[\,\text{Exactly } k - 1 \text{ components fail in } [0, t]\,] \\
  &= \Pr[T_{k:n} \le t] + \{\Pr[T_{k-1:n} \le t] - \Pr[T_{k:n} \le t]\}\,(n - k + 1)\lambda\,\delta t \\
  &= p_k(t) + \{p_{k-1}(t) - p_k(t)\}\,(n - k + 1)\lambda\,\delta t
\end{aligned} \tag{11}$$

from which we obtain

$$\begin{aligned}
p_k'(t) &= \lim_{\delta t \to 0} \frac{p_k(t + \delta t) - p_k(t)}{\delta t} \\
        &= (n - k + 1)\lambda\,\{p_{k-1}(t) - p_k(t)\}
\end{aligned} \tag{12}$$

with $p_0(0) = 1$ and $p_k(0) = 0$, $k \ne 0$ as the initial conditions.

Our formulation of section 3.1 is of course more general in that it can accommodate an arbitrary distribution.
We now show that by setting $F(t) = 1 - e^{-\lambda t}$, equation (12) can be derived from equations (5) and (6).
Differentiating the incomplete beta function form of equation (6)

$$p_k(t) = \frac{1}{B(k,\, n - k + 1)} \int_0^{1 - e^{-\lambda t}} u^{k-1} (1 - u)^{n-k}\,du$$

we get

$$p_k'(t) = \frac{n!}{(k-1)!\,(n-k)!}\, \lambda\, e^{-(n-k+1)\lambda t} (1 - e^{-\lambda t})^{k-1},$$

while using the binomial tail probability form of equation (5)

$$p_k(t) = \sum_{i=k}^{n} \binom{n}{i} (1 - e^{-\lambda t})^{i} (e^{-\lambda t})^{n-i}$$

we get

$$\begin{aligned}
(n - k + 1)\lambda\,\{p_{k-1}(t) - p_k(t)\}
  &= (n - k + 1)\lambda \left\{ \sum_{i=k-1}^{n} \binom{n}{i} (1 - e^{-\lambda t})^{i} (e^{-\lambda t})^{n-i}
     - \sum_{i=k}^{n} \binom{n}{i} (1 - e^{-\lambda t})^{i} (e^{-\lambda t})^{n-i} \right\} \\
  &= (n - k + 1)\lambda \binom{n}{k-1} (1 - e^{-\lambda t})^{k-1} (e^{-\lambda t})^{n-k+1} \\
  &= \frac{n!}{(k-1)!\,(n-k)!}\, \lambda\, e^{-(n-k+1)\lambda t} (1 - e^{-\lambda t})^{k-1},
\end{aligned}$$

verifying that $p_k'(t) = (n - k + 1)\lambda\,\{p_{k-1}(t) - p_k(t)\}$.
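
The agreement between the pure death process and the order statistics formulation can also be checked numerically. The sketch below integrates the differential equation (12) with a plain forward-Euler scheme (a choice made only for brevity, not taken from the paper) and compares the result with the binomial tail of equation (5) for exponential lifetimes; all function names are hypothetical.

```python
import math


def binomial_tail(k, n, x):
    """Equation (5) with F(t) = x."""
    return sum(math.comb(n, i) * x ** i * (1 - x) ** (n - i)
               for i in range(k, n + 1))


def integrate_death_process(n, lam, t_end, steps=100000):
    """Forward-Euler solution of p_k'(t) = (n-k+1) * lam * (p_{k-1}(t) - p_k(t))."""
    dt = t_end / steps
    p = [1.0] + [0.0] * n            # p_0 = 1 always; p_k(0) = 0 for k >= 1
    for _ in range(steps):
        new = p[:]
        for k in range(1, n + 1):
            new[k] = p[k] + (n - k + 1) * lam * (p[k - 1] - p[k]) * dt
        p = new
    return p
```

For example, with $n = 3$, $\lambda = 0.5$ and $t = 2$, each `p[k]` returned by the integrator matches `binomial_tail(k, 3, 1 - exp(-1))` up to the discretization error of the Euler scheme.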


3.3  Evaluation  of  performability 

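We are now ready to evaluate the performability of the $n$-component parallel system. Clearly $T_{k:n}$,
$k = 1, \ldots, n$ are the sojourn time boundaries. Defining $T_{0:n} = 0$, the sojourn time at state $q_k$ is

$$\tau_k = T_{k+1:n} - T_{k:n}, \qquad k = 0, \ldots, n - 1. \tag{13}$$

The accumulated reward in equation (4) can now be written as (see figure 1)

$$\begin{aligned}
Y &= \sum_{k=0}^{n-1} \rho_k \tau_k \\
  &= \sum_{k=0}^{n-1} \rho_k (T_{k+1:n} - T_{k:n}), & T_{0:n} &= 0 \\
  &= \sum_{k=1}^{n} (\rho_{k-1} - \rho_k)\, T_{k:n}, & \rho_n &= 0
\end{aligned} \tag{14}$$

so that

$$\begin{aligned}
\Pr[Y \le y] &= \Pr\Big[\sum_{k=1}^{n} (\rho_{k-1} - \rho_k)\, T_{k:n} \le y\Big] \\
  &= \int \cdots \int_{\sum_{k=1}^{n} (\rho_{k-1} - \rho_k) t_k \le y} f_{1, \ldots, n:n}(t_1, \ldots, t_n)\,dt_1 \cdots dt_n.
\end{aligned} \tag{15}$$

Similarly, the MRBF reduces to

$$\begin{aligned}
E[Y] &= \sum_{k=1}^{n} E[(\rho_{k-1} - \rho_k)\, T_{k:n}] \\
     &= \sum_{k=1}^{n} (\rho_{k-1} - \rho_k)\, E[T_{k:n}] \\
     &= \sum_{k=1}^{n} (\rho_{k-1} - \rho_k) \int_0^\infty t\, f_{k:n}(t)\,dt.
\end{aligned} \tag{16}$$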
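
The rearrangement in equation (14) and the MRBF of equation (16) can be illustrated with a short sketch. The helper names are hypothetical. The closed form used for exponential lifetimes, $E[T_{k:n}] = \frac{1}{\lambda}\sum_{i=0}^{k-1} \frac{1}{n-i}$, is a standard result that is not derived in this paper and is assumed here.

```python
import random


def reward_from_sojourns(lifetimes, rho):
    """Y = sum_k rho[k] * tau_k with tau_k = T_{k+1:n} - T_{k:n} and T_{0:n} = 0."""
    t = [0.0] + sorted(lifetimes)
    return sum(rho[k] * (t[k + 1] - t[k]) for k in range(len(lifetimes)))


def reward_from_order_stats(lifetimes, rho):
    """Y = sum_k (rho[k-1] - rho[k]) * T_{k:n}, with rho[n] = 0 (equation (14))."""
    t = [0.0] + sorted(lifetimes)
    n = len(lifetimes)
    return sum((rho[k - 1] - rho[k]) * t[k] for k in range(1, n + 1))


def mrbf_exponential(n, lam, rho):
    """Equation (16) for exponential lifetimes with hazard rate lam."""
    total, mean_tk = 0.0, 0.0
    for k in range(1, n + 1):
        mean_tk += 1.0 / ((n - k + 1) * lam)   # E[T_{k:n}] built up incrementally
        total += (rho[k - 1] - rho[k]) * mean_tk
    return total
```

Both reward functions return the same value for any sample of lifetimes. With a reward vector of the form `rho[k] = (n - k) * r`, `mrbf_exponential` returns `n * r / lam`, which is linear in the number of components.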

4  Effect  of  Critical  Components 

Until  now  we  have  not  considered  system  failure  due  to  failure  of  any  critical  component.  The  effect  of  such 
a  component  failure  is  an  abrupt  system  failure  —  the  graceful  degradation  gets  truncated  at  that  point. 
This  phenomenon  is  very  common  in  practical  systems.  However,  the  results  derived  in  the  previous  section 
would  still  apply  to  systems  where  the  critical  component  has  much  higher  reliability  as  compared  to  the 
failing  components. 

To incorporate the effect of critical components, we have to change the expression for the sojourn time
to

$$\tau_k = \begin{cases}
0, & T_c < T_{k:n} \\
T_c - T_{k:n}, & T_{k:n} \le T_c < T_{k+1:n} \\
T_{k+1:n} - T_{k:n}, & T_{k+1:n} \le T_c
\end{cases} \tag{17}$$

where $T_c$ denotes the lifetime of the critical component, distributed with cdf $F_c(t)$ and pdf $f_c(t)$.
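
A direct transcription of equation (17) makes the truncation by the critical component concrete. The function names and the sample numbers below are hypothetical.

```python
def sojourn_with_critical(order_stats, t_c, k):
    """tau_k from equation (17); order_stats[j] holds T_{j:n}, with order_stats[0] = 0."""
    start, end = order_stats[k], order_stats[k + 1]
    if t_c < start:
        return 0.0                 # the system failed before reaching state q_k
    if t_c < end:
        return t_c - start         # the critical failure cuts the stay in q_k short
    return end - start             # the stay in q_k is unaffected


def accumulated_reward(lifetimes, t_c, rho):
    """Y = sum_k rho[k] * tau_k with the truncated sojourn times."""
    t = [0.0] + sorted(lifetimes)
    return sum(rho[k] * sojourn_with_critical(t, t_c, k)
               for k in range(len(lifetimes)))
```

With lifetimes 1, 2 and 3, reward rates 3, 2, 1, 0 and a critical failure at 2.5, the accumulated reward is 3·1 + 2·1 + 1·0.5 = 5.5; a critical failure at 0.5 gives 3·0.5 = 1.5, and a critical component that outlives all others leaves the full reward of 6.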

It is also quite straightforward to handle multiple critical components. One can lump the effect onto a
single critical component of effective lifetime $T_c$ with cdf $F_c(t)$, by considering the hypothetical critical
component to be active until the first real critical component failure. Assuming there are $n_c$ critical
components with individual lifetimes $T_{c_1}, T_{c_2}, \ldots, T_{c_{n_c}}$ with cdf
$F_{c_1}(t), F_{c_2}(t), \ldots, F_{c_{n_c}}(t)$ (they may as well be different), the
effective distribution of critical component lifetime is given by

$$F_c(t) = 1 - \prod_{i=1}^{n_c} \{1 - F_{c_i}(t)\} \tag{18}$$

and $f_c(t)$ can be obtained by differentiating $F_c(t)$ with respect to time.
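
The lumping of several critical components into one effective lifetime can be sketched as follows, assuming the critical lifetimes are mutually independent. The function names are hypothetical, and the Weibull form $1 - e^{-(\lambda t)^{\alpha}}$ is used only as an example distribution.

```python
import math


def weibull_cdf(lam, alpha):
    """Return the cdf t -> 1 - exp(-(lam * t) ** alpha)."""
    return lambda t: 1.0 - math.exp(-((lam * t) ** alpha))


def effective_critical_cdf(cdfs):
    """Cdf of the first failure among independent critical components."""
    def F_c(t):
        survive = 1.0
        for F in cdfs:
            survive *= 1.0 - F(t)   # all critical components must still be alive
        return 1.0 - survive
    return F_c
```

Two exponential critical components (alpha = 1) with rates 0.1 and 0.3 combine into an exponential lifetime with rate 0.4, as expected for the minimum of independent exponentials.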

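Given that $T_{m:n} \le T_c < T_{m+1:n}$, $Y$ from equation (4) reduces to (see figure 2)

$$Y = \sum_{k=1}^{m} (\rho_{k-1} - \rho_k)\, T_{k:n} + \rho_m T_c. \tag{19}$$

Conditioning on the disjoint events $T_c < T_{1:n}$, $T_{m:n} \le T_c < T_{m+1:n}$ ($m = 1, \ldots, n - 1$) and
$T_{n:n} \le T_c$, the performability distribution is

$$\begin{aligned}
\Pr[Y \le y] = {} & \iint_{\rho_0 t_c \le y,\ t_c < t_1} f_{1:n}(t_1)\, f_c(t_c)\,dt_1\,dt_c \\
  & + \sum_{m=1}^{n-1} \int \cdots \int_{\sum_{k=1}^{m} (\rho_{k-1} - \rho_k) t_k + \rho_m t_c \le y,\ t_m \le t_c < t_{m+1}}
      f_{1, \ldots, m+1:n}(t_1, \ldots, t_{m+1})\, f_c(t_c)\,dt_1 \cdots dt_{m+1}\,dt_c \\
  & + \int \cdots \int_{\sum_{k=1}^{n} (\rho_{k-1} - \rho_k) t_k \le y,\ t_n \le t_c}
      f_{1, \ldots, n:n}(t_1, \ldots, t_n)\, f_c(t_c)\,dt_1 \cdots dt_n\,dt_c.
\end{aligned} \tag{20}$$

An expression for MRBF can be obtained by conditioning on the same set of disjoint events as

$$\begin{aligned}
E[Y] = {} & E[Y \mid T_c < T_{1:n}] \cdot \Pr[T_c < T_{1:n}] \\
  & + \sum_{m=1}^{n-1} E[Y \mid T_{m:n} \le T_c < T_{m+1:n}] \cdot \Pr[T_{m:n} \le T_c < T_{m+1:n}] \\
  & + E[Y \mid T_{n:n} \le T_c] \cdot \Pr[T_{n:n} \le T_c] \\
= {} & E[\rho_0 T_c \mid T_c < T_{1:n}] \cdot \Pr[T_c < T_{1:n}] \\
  & + \sum_{m=1}^{n-1} E\Big[\sum_{k=1}^{m} (\rho_{k-1} - \rho_k)\, T_{k:n} + \rho_m T_c \,\Big|\, T_{m:n} \le T_c < T_{m+1:n}\Big]
      \cdot \Pr[T_{m:n} \le T_c < T_{m+1:n}] \\
  & + E\Big[\sum_{k=1}^{n} (\rho_{k-1} - \rho_k)\, T_{k:n} \,\Big|\, T_{n:n} \le T_c\Big] \cdot \Pr[T_{n:n} \le T_c] \\
= {} & \rho_0 \iint_{t_c < t_1} t_c\, f_{1:n}(t_1)\, f_c(t_c)\,dt_1\,dt_c \\
  & + \sum_{m=1}^{n-1} \Bigg\{ \sum_{k=1}^{m-1} (\rho_{k-1} - \rho_k)
      \iiiint_{t_k \le t_m \le t_c < t_{m+1}} t_k\, f_{k,m,m+1:n}(t_k, t_m, t_{m+1})\, f_c(t_c)\,dt_k\,dt_m\,dt_{m+1}\,dt_c \\
  & \qquad + (\rho_{m-1} - \rho_m) \iiint_{t_m \le t_c < t_{m+1}} t_m\, f_{m,m+1:n}(t_m, t_{m+1})\, f_c(t_c)\,dt_m\,dt_{m+1}\,dt_c \\
  & \qquad + \rho_m \iiint_{t_m \le t_c < t_{m+1}} t_c\, f_{m,m+1:n}(t_m, t_{m+1})\, f_c(t_c)\,dt_m\,dt_{m+1}\,dt_c \Bigg\} \\
  & + \sum_{k=1}^{n-1} (\rho_{k-1} - \rho_k) \iiint_{t_k \le t_n \le t_c} t_k\, f_{k,n:n}(t_k, t_n)\, f_c(t_c)\,dt_k\,dt_n\,dt_c \\
  & + (\rho_{n-1} - \rho_n) \iint_{t_n \le t_c} t_n\, f_{n:n}(t_n)\, f_c(t_c)\,dt_n\,dt_c.
\end{aligned} \tag{21}$$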

5  Numerical  Evaluation  and  Results 

To illustrate how the formulations developed here can be used for computing performability distribution and
MRBF, we consider a perfectly scalable system made of $n$ components and a single critical component. In
other words, this indicates a reward assignment of the form

$$\rho_k = (n - k)\rho, \qquad k = 0, \ldots, n.$$

The analysis only suggests that the sequence $\rho_0 \ge \rho_1 \ge \cdots \ge \rho_n$ is monotonically nonincreasing; the
rewards could depend on the system states $q_0, q_1, \ldots, q_n$ in a potentially complex way. On the other hand, a
scalable reward assignment can be used as a consistency check on the correctness of numerical routines, by
verifying whether or not linearity between the number of components and MRBF results.

To obtain the numerical values, the series of integrals in equations (15-16) and (20-21) need to be
evaluated. We use multidimensional Gaussian quadrature to perform the integrations numerically in the specified
regions. For illustrations and comparisons, we assume that the lifetimes of both the critical and non-critical
components are modelled by the Weibull distribution,

$$\Pr[T \le t] = 1 - e^{-(\lambda t)^{\alpha}},$$

the parameters $\lambda$ and $\alpha$ being different in general. The necessary marginal and joint density functions of the
order statistics can then be used to compute the values of the integrands at any point. The grid points for
the quadrature are generated by recursively using the Gauss-Legendre formula [16], with a suitably large value
chosen for infinity. This was found to be numerically stable and well-behaved for the exponential class of
integrands, as compared to a combination of Gauss-Laguerre and Gauss-Legendre quadrature designed for
semi-open regions.
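
The quadrature idea can be illustrated in one dimension. The sketch below is not the authors' numerical routine: it computes Gauss-Legendre nodes by Newton iteration on the Legendre polynomial and uses them to evaluate $E[T_{k:n}] = \int_0^\infty t\, f_{k:n}(t)\,dt$ for Weibull lifetimes, with a finite upper limit `t_max` standing in for infinity. The function names, the 60-point rule and `t_max = 40` are assumptions chosen for the example.

```python
import math


def gauss_legendre(n):
    """Nodes and weights of the n-point Gauss-Legendre rule on [-1, 1]."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))    # initial guess for the root
        for _ in range(100):
            p0, p1 = 1.0, x                               # P_0 and P_1
            for j in range(2, n + 1):                     # three-term recurrence up to P_n
                p0, p1 = p1, ((2 * j - 1) * x * p1 - (j - 1) * p0) / j
            dp = n * (x * p1 - p0) / (x * x - 1.0)        # derivative P_n'(x)
            dx = p1 / dp
            x -= dx                                       # Newton step
            if abs(dx) < 1e-15:
                break
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights


def integrate(f, a, b, n=60):
    """Approximate the integral of f over [a, b] with an n-point rule."""
    xs, ws = gauss_legendre(n)
    half, mid = 0.5 * (b - a), 0.5 * (a + b)
    return half * sum(w * f(mid + half * x) for x, w in zip(xs, ws))


def mean_order_stat(k, n, lam, alpha, t_max=40.0):
    """E[T_{k:n}] for Weibull lifetimes, integrating t * f_{k:n}(t) over [0, t_max]."""
    coeff = math.factorial(n) / (math.factorial(k - 1) * math.factorial(n - k))

    def integrand(t):
        z = (lam * t) ** alpha
        F = 1.0 - math.exp(-z)                                          # Weibull cdf
        f = alpha * lam * (lam * t) ** (alpha - 1) * math.exp(-z)       # Weibull pdf
        return t * coeff * F ** (k - 1) * (1 - F) ** (n - k) * f

    return integrate(integrand, 0.0, t_max)
```

For exponential lifetimes with unit rate (alpha = 1), the mean of the second of three order statistics is 1/3 + 1/2, which gives a convenient check on the truncation and on the rule. The multidimensional integrals of equations (15) and (20-21) would apply the same rule recursively, one dimension at a time.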


Figure 3: Performability distribution (#components = 2, α = 1 vs. 2)

Figure 3 shows the effect of the critical component on the performability distribution function for a two
component system, with $\alpha = 1$ and 2 in (a) and (b) respectively. We restrict the number of components to
$n = 2$, because for higher values, long CPU hours are needed to evaluate the related integrals of up to $(n + 1)$
dimensions to the desired accuracy. The unit used for the accumulated reward is component-day, the total
throughput available from a component over one day. As the critical components need to be more reliable than
the non-critical components, we increase the MTTF of the critical component keeping the MTTF of the
non-critical components unchanged. The cdf of the accumulated reward of the system, as a result, approaches
in the limit that of a system with no critical components. The parameter $\alpha = 1$ models the Markovian system
with constant hazard rate, whereas $\alpha = 2$ corresponds to a case of increasing hazard rate. The motivation
behind studying systems with the increasing hazard rate is that a Markovian approximation for them may
lead to over-optimistic results. With increasing MTTF of the critical component, the performability distribution
approaches the limiting case faster for the increasing hazard rate.


Figure  4:  Mean  reward  before  failure  (o  =  1  vs.  2) 

The observations are, however, not limited to systems with two components. Computing MRBF for
systems with critical components requires evaluation of integrals of only up to 4 dimensions, although the
total number of such integral evaluations grows as O(n^). As a result, computing MRBF for larger systems
is still possible, whereas computing the performability distribution is not. We have computed the MRBFs of
systems with up to eight identical components, and they have been plotted against the number of components
in Figure 4 for α = 1 and 2 in (a) and (b) respectively. Naturally, component-day is also the choice of
unit for MRBF. With increasing MTTF of the critical component, the MRBF of the system approaches the
limiting MRBF of the system with no critical component. As we have observed in the case of the performability
distribution, this is more prominent with increasing hazard rate than in the Markovian, or constant
hazard rate, case. Clearly, MRBF is a good indicator of performability in that it preserves the trend. This also
means that the reliability requirement on the critical component to achieve the best possible performability
is less stringent when the hazard rate is increasing.
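
The trend described above can be reproduced with a small simulation. The sketch below is not the order-statistics formulation used in this paper. It assumes a simplified model: n identical non-critical components and one critical component, all independent with Weibull lifetimes of the same shape, where each working component earns reward at unit rate until it fails or the critical component fails. Under these assumptions the MRBF has the closed form n · MTTF · (1 + r^α)^(-1/α), with r the ratio of the non-critical MTTF to the critical MTTF. All names and numbers are hypothetical.

```python
import math
import random

def weibull_scale_for_mttf(mttf, shape):
    # Assumption: Weibull lifetime, whose mean is scale * Gamma(1 + 1/shape).
    return mttf / math.gamma(1.0 + 1.0 / shape)

def analytic_mrbf(n, mttf, mttf_critical, shape):
    # Closed form for the simplified model: n * E[min(T_i, T_critical)].
    r = mttf / mttf_critical
    return n * mttf * (1.0 + r ** shape) ** (-1.0 / shape)

def simulate_mrbf(n, mttf, mttf_critical, shape, runs=20000, seed=0):
    """Monte Carlo estimate of mean reward before failure, in component-days."""
    rng = random.Random(seed)
    scale = weibull_scale_for_mttf(mttf, shape)
    scale_c = weibull_scale_for_mttf(mttf_critical, shape)
    total = 0.0
    for _ in range(runs):
        # random.weibullvariate takes (scale, shape) in that order.
        t_crit = rng.weibullvariate(scale_c, shape)
        # Each component earns reward until it, or the critical component, fails.
        total += sum(min(rng.weibullvariate(scale, shape), t_crit)
                     for _ in range(n))
    return total / runs

# Hypothetical setting: 4 components with a 10-day MTTF each.
for mttf_c in (10.0, 100.0, 1000.0):
    for shape in (1, 2):
        print(mttf_c, shape,
              simulate_mrbf(4, 10.0, mttf_c, shape),
              analytic_mrbf(4, 10.0, mttf_c, shape))
```

In this simplified model the limiting MRBF with no critical component is n · MTTF, and for a given critical-component MTTF the shape-2 case sits closer to that limit than the shape-1 case, in line with the observation above.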


6  Conclusions 

The usefulness of a closed-form analytical solution for any system measure cannot be overemphasized. In
this paper, we have developed a methodology based on order statistics to analytically evaluate performability
figures for a non-repairable multicomponent parallel system with one or more critical elements. This
description indeed characterizes a simple yet broad class of parallel systems. An SIMD computer can be cited as one
example, where the processing elements (PEs) form the pool of degradable components, whereas the master
CPU, the controller and the clock driver can be identified as the critical components. The applicability of
the approach is, however, not limited to the configuration mentioned above. Under suitable bounding
arguments, similar degradation steps can be found embedded in more complex parallel processing systems.
There is scope for extending the formulation to some arrays and interconnection networks.

Our approach breaks with tradition in that it does not call for making assumptions about the sojourn
time distributions, whether memoryless or otherwise. The key assumption is that of the independence of
component lifetimes, which is more tangible to a system designer. The lifetime of a component can have any
general (known) distribution, possibly supplied by the manufacturer. This is clearly in contrast to the Markov
model, where hazard rates for components are assumed constant. Thus, in a real system with increasing
hazard rate, Markovian analysis would come up with over-optimistic estimates, whereas our analysis should
be more accurate. The technique can also prove useful for studying the effect of the critical component and for
doing a requirement analysis on its MTTF.

A  Appendix:  Incomplete  Beta  function 

We have used an identity relating the partial binomial sum to the incomplete beta function. For 0 < p <
1, a > 0, b > 0, the incomplete beta function I_p(a, b) is defined to be

$$ I_p(a, b) = \frac{1}{B(a, b)} \int_0^p x^{a-1} (1 - x)^{b-1} \, dx, $$

where

$$ B(a, b) = \int_0^1 x^{a-1} (1 - x)^{b-1} \, dx $$

is the complete beta function. A proof for the identity

$$ I_p(k, n-k+1) = \sum_{i=k}^{n} \binom{n}{i} p^i (1 - p)^{n-i}, $$

for integer n ≥ k, is given here.

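Proof: Integrating the LHS by parts, we have

$$
\begin{aligned}
I_p(k, n-k+1)
&= \frac{1}{B(k, n-k+1)} \left\{ \left[ \frac{x^k}{k} (1 - x)^{n-k} \right]_0^p
   + \frac{n-k}{k} \int_0^p x^k (1 - x)^{n-k-1} \, dx \right\} \\
&= \frac{\Gamma(n+1)}{\Gamma(k+1)\,\Gamma(n-k+1)} \, p^k (1 - p)^{n-k} + I_p(k+1, n-k) \\
&= \binom{n}{k} p^k (1 - p)^{n-k} + I_p(k+1, n-k).
\end{aligned}
$$

Observing that

$$
I_p(n, 1) = \frac{1}{B(n, 1)} \int_0^p x^{n-1} \, dx
          = \frac{\Gamma(n+1)}{\Gamma(n)\,\Gamma(1)} \left[ \frac{x^n}{n} \right]_0^p
          = p^n,
$$

the recurrence can be solved as

$$
\begin{aligned}
I_p(k, n-k+1)
&= I_p(k+2, n-k-1) + \binom{n}{k} p^k (1 - p)^{n-k} + \binom{n}{k+1} p^{k+1} (1 - p)^{n-k-1} \\
&= \sum_{i=k}^{n} \binom{n}{i} p^i (1 - p)^{n-i}.
\end{aligned}
$$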
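
The identity can also be checked numerically. The following is a minimal sketch, assuming integer arguments and using a plain composite Simpson rule for the integrals. The function names and the step count are hypothetical choices made for this illustration.

```python
import math

def incomplete_beta(p, a, b, steps=2000):
    """Incomplete beta I_p(a, b) for a, b >= 1, by composite Simpson's rule."""
    def f(x):
        return x ** (a - 1) * (1.0 - x) ** (b - 1)

    def simpson(lo, hi):
        # steps must be even for Simpson's rule.
        h = (hi - lo) / steps
        s = f(lo) + f(hi)
        for j in range(1, steps):
            s += (4 if j % 2 else 2) * f(lo + j * h)
        return s * h / 3.0

    # Ratio of the partial integral to the complete beta function B(a, b).
    return simpson(0.0, p) / simpson(0.0, 1.0)

def binomial_tail(p, k, n):
    # Partial binomial sum over i = k, ..., n.
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))

n, p = 8, 0.3
for k in range(1, n + 1):
    lhs = incomplete_beta(p, k, n - k + 1)
    rhs = binomial_tail(p, k, n)
    print(k, lhs, rhs, abs(lhs - rhs))
```

For integer arguments the integrand is a polynomial, so the two columns should agree up to the quadrature and rounding error.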